C++ Weekly - Ep 430 - How Short String Optimizations Work

Views: 14,658

C++ Weekly With Jason Turner

☟☟ Awesome T-Shirts! Sponsors! Books! ☟☟
Upcoming Workshop: C++ Best Practices, NDC TechTown, Sept 9-10, 2024
► ndctechtown.co...
Upcoming Workshop: Applied constexpr: The Power of Compile-Time Resources, C++ Under The Sea, October 10, 2024
► cppunderthesea...
CLion is a cross-platform JetBrains IDE for C and C++ with:
- A smart C and C++ editor to navigate and maintain your code base productively.
- Code analysis with quick-fixes to identify and fix bugs and style inconsistencies.
- An integrated debugger - along with other essential tools from the ecosystem - available straight out of the box.
- And much more!
jb.gg/clion_ide code: CppWeeklyCLion
Episode details: github.com/lef...
T-SHIRTS AVAILABLE!
► The best C++ T-Shirts anywhere! my-store-d16a2...
WANT MORE JASON?
► My Training Classes: emptycrate.com/...
► Follow me on twitter: / lefticus
SUPPORT THE CHANNEL
► Patreon: / lefticus
► Github Sponsors: github.com/spo...
► Paypal Donation: www.paypal.com...
GET INVOLVED
► Video Idea List: github.com/lef...
JASON'S BOOKS
► C++23 Best Practices
Leanpub Ebook: leanpub.com/cp...
► C++ Best Practices
Amazon Paperback: amzn.to/3wpAU3Z
Leanpub Ebook: leanpub.com/cp...
JASON'S PUZZLE BOOKS
► Object Lifetime Puzzlers Book 1
Amazon Paperback: amzn.to/3g6Ervj
Leanpub Ebook: leanpub.com/ob...
► Object Lifetime Puzzlers Book 2
Amazon Paperback: amzn.to/3whdUDU
Leanpub Ebook: leanpub.com/ob...
► Object Lifetime Puzzlers Book 3
Leanpub Ebook: leanpub.com/ob...
► Copy and Reference Puzzlers Book 1
Amazon Paperback: amzn.to/3g7ZVb9
Leanpub Ebook: leanpub.com/co...
► Copy and Reference Puzzlers Book 2
Amazon Paperback: amzn.to/3X1LOIx
Leanpub Ebook: leanpub.com/co...
► Copy and Reference Puzzlers Book 3
Leanpub Ebook: leanpub.com/co...
► OpCode Puzzlers Book 1
Amazon Paperback: amzn.to/3KCNJg6
Leanpub Ebook: leanpub.com/op...
RECOMMENDED BOOKS
► Bjarne Stroustrup's A Tour of C++ (now with C++20/23!): amzn.to/3X4Wypr
AWESOME PROJECTS
► The C++ Starter Project - Gets you started with Best Practices Quickly - github.com/cpp...
► C++ Best Practices Forkable Coding Standards - github.com/cpp...
O'Reilly VIDEOS
► Inheritance and Polymorphism in C++ - www.oreilly.co...
► Learning C++ Best Practices - www.oreilly.co...

Comments: 51
@cppweekly 2 months ago
There were a couple of bugs in the implementation that GCC let slide - mostly around the use of initializing / assigning C-style arrays. Thanks to a viewer for pointing that out. Here's an updated version that works on GCC/Clang/MSVC compiler-explorer.com/z/3ecYaKK8e
@nyyakko 2 months ago
I think this has to be one of the coolest examples of how and why to use constexpr.
@patrixonon 2 months ago
Yep, I've been writing my own custom string class and had to debug such cases at runtime, which took me quite some time to get right, whereas constexpr simply lets us see the problem at compile time, which is very useful.
@mCoding 2 months ago
Top notch! You did a great job showing off the core ideas of SSO/SOO that can be adapted to any dev's situation while not getting bogged down by hyper-optimizations.
@negidrums915 2 months ago
constexpr is essentially a compile-time UB sanitizer😮
@cppweekly 2 months ago
constexpr all the things!
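A minimal sketch of the "compile-time UB sanitizer" idea from this thread, assuming a C++17 compiler. The helper `sum_first` and the array `kValues` are hypothetical names, not code from the episode. An out-of-bounds read is undefined behavior at runtime, but inside a constant expression the compiler has to reject it.

```cpp
#include <cstddef>

// Hypothetical helper: sums the first `count` elements of a 4-element array.
constexpr int sum_first(const int (&values)[4], std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];  // reads out of bounds if count > 4
    }
    return total;
}

constexpr int kValues[4] = {1, 2, 3, 4};

// In bounds: accepted as a constant expression.
static_assert(sum_first(kValues, 4) == 10);

// Out of bounds: uncommenting the next line makes the compiler reject the
// program, because undefined behavior is not allowed in constant evaluation.
// static_assert(sum_first(kValues, 5) == 10);
```

The same call made at runtime with `count == 5` could silently return garbage, which is the debugging pain described in the comments above.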
@sirhenrystalwart8303 1 month ago
I'm way more sold on the benefit of constexpr/consteval after seeing this.
@cppweekly 1 month ago
Yeah, I'm still refining the way I present this stuff. This is a really good argument for it!
@Xilefian 2 months ago
A good exercise is implementing a small-object-optimised vector
@Raspredval1337 2 months ago
I've just googled this topic today! PS: one could pack the small string even tighter. There's a way to use a single byte for the size, but you'd store max_small_string_capacity minus size instead. When the small string's size equals the max capacity, capacity minus size becomes zero and acts as a null terminator. 🤯
@higaski 2 months ago
That sounds like UB, given that the compiler is free to add padding to its liking.
@garyp.7501 2 months ago
@@higaski The trick that @user-cy1m5vb71 uses could be put into yet another union, and thus you'd know about the packing.
@Raspredval1337 2 months ago
#include <cstddef>
#include <cstdint>

struct string {
  struct s_large {
    char* data;          // assuming it's a 64-bit pointer
    std::uint64_t size;
    // maybe even some padding to add extra stack space,
    // like char padding[8];
  };
  struct s_small {
    // one byte of s_large's footprint is reserved for the counter below
    static constexpr std::size_t max_capacity = sizeof(s_large) - 1;
    char data[max_capacity];
    char max_capacity_minus_size;  // 0 when full, so it doubles as the terminator
  };
  std::uint64_t capacity;
  union {
    s_large l;
    s_small s;
  } data;
};

Even if some alignment padding is added to the members of s_large, s_small should still be fine.
@UsernameUsername0000 2 months ago
@@Raspredval1337 I don’t get this approach. Isn't c_str() on a small string then not guaranteed to be null-terminated?
@Raspredval1337 2 months ago
@@UsernameUsername0000 If the whole small string storage is zero-initialized, then it will be. Plus, when the small string's size equals its capacity, the last byte (capacity_minus_size) is zero, thus acting as a null terminator.
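The trick discussed in this thread can be sketched roughly as follows, assuming C++17. The class `small_buffer` and its members are hypothetical and only cover the inline case. A real string would switch to heap storage rather than truncate. The last byte stores the remaining capacity, so a full buffer ends in 0 and is null-terminated for free.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical 16-byte inline buffer: 15 payload bytes plus one byte that
// stores "remaining capacity" instead of the size.
class small_buffer {
public:
    static constexpr std::size_t max_size = 15;

    // Assumption for this sketch: text longer than max_size is truncated.
    constexpr explicit small_buffer(std::string_view text) {
        const std::size_t n = text.size() < max_size ? text.size() : max_size;
        for (std::size_t i = 0; i < n; ++i) {
            m_bytes[i] = text[i];
        }
        // The storage is zero-initialized, so m_bytes[n] is already '\0'
        // whenever n < max_size. When n == max_size this byte becomes 0.
        m_bytes[max_size] = static_cast<char>(max_size - n);
    }

    constexpr std::size_t size() const {
        return max_size - static_cast<std::size_t>(m_bytes[max_size]);
    }
    constexpr const char* c_str() const { return m_bytes; }

private:
    char m_bytes[max_size + 1] = {};
};

static_assert(sizeof(small_buffer) == 16);
static_assert(small_buffer("hello").size() == 5);
static_assert(small_buffer("123456789012345").size() == 15);
static_assert(small_buffer("123456789012345").c_str()[15] == '\0');
```

This addresses the c_str() question above: for shorter strings the terminator comes from zero-initialization, and for a full buffer it comes from the counter byte itself.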
@hwstar9416 2 months ago
Honestly, I'd rather small string were a different type, maybe something like: small_string str;
@ohwow2074 2 months ago
I don't know why they didn't choose to do that in the first place. Just let the user specify the size of the small buffer based on their needs. Now we're limited to a 16-byte buffer.
@mjKlaim 2 months ago
I would have made `is_small_storage()` a function which returns either `m_size < ` an arbitrary constant or the result of `m_size < sizeof(small_storage)` (so no additional storage needed for the bool). In several functions we would then need to re-evaluate this but I think it's worth it as it takes less space.
@antonpieper 2 months ago
Also, a "problem" with this and the video version is that the assignment operator would need to deallocate the memory if you assign a small string to a previously large string, because otherwise you would lose the pointer to the allocation.
@mjKlaim 2 months ago
@@antonpieper Yes, the check has to be done for any of the modifying operations, I guess.
@Mozartenhimer 2 months ago
I think the real optimization would be just to make the most significant bit of the size a flag
@mjKlaim 2 months ago
@@Mozartenhimer Doesn't that reduce the max size of the container? It's an unsigned integer type, so all the bits are used.
@Mozartenhimer 2 months ago
@@mjKlaim I don't think you'll miss that 9.2 exabytes.
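A rough sketch of the "most significant bit as a flag" suggestion from this thread, assuming a 64-bit size field. The type `tagged_size` and its members are hypothetical names. The flag costs one bit of range, which leaves 2^63 bytes (about 9.2 exabytes) as the maximum representable size.

```cpp
#include <cstdint>

// Hypothetical field that packs "is heap allocated" into the top bit of the size.
class tagged_size {
public:
    static constexpr std::uint64_t heap_flag = std::uint64_t{1} << 63;

    constexpr tagged_size(std::uint64_t size, bool on_heap)
        : m_bits((size & ~heap_flag) | (on_heap ? heap_flag : 0)) {}

    // Mask the flag off to recover the size.
    constexpr std::uint64_t size() const { return m_bits & ~heap_flag; }
    constexpr bool on_heap() const { return (m_bits & heap_flag) != 0; }

    // Changing the size keeps the storage flag untouched.
    constexpr void set_size(std::uint64_t size) {
        m_bits = (m_bits & heap_flag) | (size & ~heap_flag);
    }

private:
    std::uint64_t m_bits;
};

static_assert(sizeof(tagged_size) == sizeof(std::uint64_t));
static_assert(tagged_size(20, true).size() == 20);
static_assert(tagged_size(20, true).on_heap());
static_assert(!tagged_size(3, false).on_heap());
```

Because the flag is independent of the size value, a heap-allocated string can shrink below the small-buffer threshold without the flag changing.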
@lkedvenc6898 2 months ago
I have implemented my own string class and the idea was quite similar. The regular std::string ruins the heap if you use it very frequently. Of course my string was better. ;-)
@cyrilemeka6987 1 month ago
Did you implement a copy-on-write string?
@FedericoPekin 2 months ago
awesome one!
@Subdest 2 months ago
Why is a Boolean needed to check whether it is a small object? The size is present, and the max size of a small object is also known. I doubt that checking a Boolean is much faster than checking whether some value is less than a constant…
@hampus23 2 months ago
Well, if you shrink the heap allocated string you don't reallocate so that won't work but there are better ways to do this.
@hampus23 2 months ago
Raymond Chen has written a great post (Inside STL: The string) about this.
@garyp.7501 2 months ago
If the string is shrunk, i.e. a short string is put into what was a long string, you'd have to be sure to deallocate the extra space. That might be surprising to some users. I.e.: long string ... assign short string ... deallocate; long string ... allocate again. Versus: long string ... assign short string ... no deallocation; long (but not too long) string ... also no reallocation.
@Raspredval1337 2 months ago
The size can be zero even if it's a heap-allocated string. But the capacity does reflect its nature, though; you can use that, as long as the heap-allocated capacity is always larger than the max small string capacity.
@cppweekly 2 months ago
Because it's possible I grew the string bigger than small object size, allocated, then shrunk the string later and don't want to copy data / deallocate storage.
@PaulMetalhero 2 months ago
Would love to see a std::variant version
@cyrilemeka6987 1 month ago
Maybe to avoid std::visit and lambda pain?
@garyp.7501 2 months ago
To pack this object into even less space, you could use a bit field for m_size: while you then can't allocate a string whose size needs an unsigned 64-bit integer, you can allocate one whose size fits in an unsigned 63-bit integer. Then use a bit at either end to indicate whether the space was allocated on the heap or uses the m_small field of your union. There is of course a minimal overhead in looking at those bit fields versus a bool or an int, but this is something that red/black tree maps do.
@cyrilemeka6987 1 month ago
The overhead of bit manipulation should be minimal or non-existent. Computers live and breathe those operations. I implemented a u8char class that unfortunately doesn't cache the calculated Unicode code point due to memory concerns, but retrieving the Unicode code points from the UTF-8 encoded text is a relatively fast operation because I switched to bit manipulation instead of using std::string and std::bitset, which were really costly.
@cyrilemeka6987 1 month ago
Oh and he won't be able to use the bits on either end, the most significant bit is more apt.
@anon_y_mousse 2 months ago
The only thing I dislike here, as usual, is all of the boilerplate code. However, if they incorporate such a string type into the standard then that goes away because you won't have to implement it yourself, and in such a case I would have no complaints.
@TheRobbix1206 2 months ago
I think I would have used the capacity as the common field instead of the size. If, for example, you change the length of your string a lot between, say, 12 and 20 characters, your solution forces you to allocate and deallocate each time you cross the 16-byte boundary, whereas the capacity can stay at ~20 while your size moves around as it wants. To use capacity instead, we can say that size() = capacity when capacity < 16, else size, and capacity() = max(16, capacity).
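The accessor rules in the comment above could look roughly like this, assuming C++17. The struct `string_fields` and its member names are hypothetical, and no allocation or character storage is shown. In a real layout `heap_size` would live inside the union next to the heap pointer.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical bookkeeping fields only; the character storage is omitted.
struct string_fields {
    static constexpr std::size_t small_capacity = 16;

    // Below 16: the size of the inline string. 16 or more: the heap capacity.
    std::size_t capacity_or_size;
    // Only meaningful when the data lives on the heap.
    std::size_t heap_size;

    constexpr bool is_small() const { return capacity_or_size < small_capacity; }
    constexpr std::size_t size() const {
        return is_small() ? capacity_or_size : heap_size;
    }
    constexpr std::size_t capacity() const {
        return std::max(small_capacity, capacity_or_size);
    }
};

// Inline string of 12 characters.
static_assert(string_fields{12, 0}.is_small());
static_assert(string_fields{12, 0}.size() == 12);
static_assert(string_fields{12, 0}.capacity() == 16);

// Heap string that grew to capacity 20 and later shrank to 12 characters:
// it stays on the heap, so crossing the 16-byte boundary costs nothing.
static_assert(!string_fields{20, 12}.is_small());
static_assert(string_fields{20, 12}.size() == 12);
static_assert(string_fields{20, 12}.capacity() == 20);
```

The design choice here is that the storage kind is derived from a value that only changes on reallocation, so no separate bool is needed.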
@UsernameUsername0000 2 months ago
Isn’t std::size_t in not ?
@ohwow2074 2 months ago
Yeah
@cppweekly 2 months ago
I often get that one wrong - sorry!
@Raspredval1337 2 months ago
btw, offtopic question: since you really shouldn't store a function address in a void pointer, it's recommended to use a dummy function pointer type instead. The question is: can you reliably store a method pointer into this dummy function pointer type? Or what can you use instead of a void pointer to store a method pointer? 🤔
@cppweekly 2 months ago
I've never heard of this recommendation to use a dummy function pointer type instead. I just double-checked, and casting to and from void * is allowed as of C++11. The problem is that it's less portable: on something like a Harvard architecture you might actually have different-sized pointers for function pointer and memory pointer types... I would personally still use void * when I need to do that. So, I have no specifically clear guidance here.
@EgorChebotarev 1 month ago
cool
@pierrecolin6376 2 months ago
Is std::launder necessary for unions? I know the answer is no when the data members are active only once in the lifetime of the union, but this is not the case here.
@cppweekly 2 months ago
I believe launder is only necessary in the case of something like placement new (en.cppreference.com/w/cpp/utility/launder); there's no mention of unions there.