Changing std::sort at Google’s Scale and Beyond

TL;DR; We are changing std::sort in LLVM’s libcxx. That’s a long story of what it took us to get there and all possible consequences, bugs you might encounter with examples from open source. We provide some benchmarks, perspective, why we did this in the first place and what it cost us with exciting ideas fromContinue reading “Changing std::sort at Google’s Scale and Beyond”

Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator

TCMalloc team recently published a paper on OSDI’21 about Google’s allocator internals, specifically on how huge pages are used. You can read the full paper here. TL;DR. Google saved 2.4% of the memory fleet and increased the QPS performance of the most critical applications by 7.7%, an impressive result worth noting. Code is open sourced,Continue reading “Beyond malloc efficiency to fleet efficiency: a hugepage-aware memory allocator”

How a Bug(?) in the Linux CRC-32 Checksum Turned out not to be a Bug

A friend of mine nerd sniped me with a good story to investigate which kinda made me giggle and want to tell to the public some cool things happened to me during my experience with hashing and other stuff. The story is simple and complex at the same time, I’d try to stick to aContinue reading “How a Bug(?) in the Linux CRC-32 Checksum Turned out not to be a Bug”

Miniselect: Practical and Generic Selection Algorithms

Today I present a big effort from my side to publish miniselect — generic C++ library to support multiple selection and partial sorting algorithms. It is already used in ClickHouse with huge performance benefits. Exact benchmarks and results will be later in this post and now let’s tell some stories about how it all arose.Continue reading “Miniselect: Practical and Generic Selection Algorithms”

I need extra C/C++ performance now. How?

Hello everyone, today we are going to talk about C++ performance. Again, another “usual” perf blog as you may think. I promise to give really practical advice and don’t get bullshitty (okay, a little bit) talks about how C++ is performant. We are gonna roast the compilers and think about what we should do aboutContinue reading “I need extra C/C++ performance now. How?”

News Aggregator from Scratch in 2 Weeks

Data Clustering Contest was a two-week data science and engineering competition held by Telegram in May. The Telegram team wanted participants to build a simple yet effective news aggregator that consolidates thousands of articles from various publishers and websites into a single page, which shows the latest news in real time, exactly like Google News, Bing News orContinue reading “News Aggregator from Scratch in 2 Weeks”

Optimizing 128-bit Division

When it comes to hashing, sometimes 64 bit is not enough, for example, because of birthday paradox — the hacker can iterate through random entities and it can be proven that with some constant probability they will find a collision, i.e. two different objects will have the same hash. is around 4 billion objects andContinue reading “Optimizing 128-bit Division”

How to crack HashCode competition with some engineering skills

Hi everyone, hashcode is over. As now I can tell anything, I want just to tell how I was solving this problem, my methodologies, overall the structure of the data sets. I am not going to provide any internals and whatsoever, all data can be checked and much of it is my personal opinion thoughContinue reading “How to crack HashCode competition with some engineering skills”

I wrote Go code for 3 weeks and you won’t believe what happened next

Historically, I am a C++ programmer and really like all this kind of C-ish stuff — from raw assembly to high level abstractions and SFINAE methods. For the past three weeks I managed to write lots of Go code — around 10k LOC without almost knowing anything when I started. I want to talk whatContinue reading “I wrote Go code for 3 weeks and you won’t believe what happened next”