Wake up! It’s HighScalability time:

Do you like this sort of Stuff? Your support on Patreon is appreciated more than you can know. I also wrote Explain the Cloud Like I’m 10 for everyone who needs to understand the cloud (which is everyone). On Amazon it has 90 mostly 5 star reviews (152 on Goodreads). Please recommend it. You’ll be a real cloud hero.

Number Stuff:

  • 95.26%: uptime for a solar-powered website. One kilowatt-hour of solar-generated electricity can serve almost 50,000 unique visitors. 
  • $4.7 billion: YouTube Q4 revenue. $2.6 billion cloud. On a yearly basis YouTube is up 36 percent from 2018 and 86 percent compared to 2017. Scheduling an ad every 5 seconds is profitable. Nick Statt: 20% the size of Facebook’s; contributes 10% of all Google revenue; 6x bigger than Twitch; about 20 percent the size of the entire US TV ad spend 
  • 25%: less energy used by a smart heating and cooling control system.
  • 10x: increase in WiFi signal strength and 2x median channel capacity using a smart wall surface that “can work both as a mirror or a lens” to focus radio signals onto the right devices on either side of the “fence.”
  • 150 million: Amazon Prime members. Remember, Costco makes their money on membership dues, not goods. 
  • 2 exabytes: space needed for complete wiring diagram for a mouse brain. 
  • 11%: increase in commerce when eBay improved its translation function. 
  • 175: Amazon retail fulfillment centers worldwide with over 250,000 full-time associates shipping millions of items per day.
  • 62%: increase in Microsoft Azure revenue year-on-year. Its Dynamics 365 service – cloud-based enterprise resource planning – jumped 42 percent.
  • $2 billion: smart building market revenue by 2026 for software and services.
  • 143: New Geoglyphs Discovered on the Nasca Pampa 
  • 218 million: Snapchat users, up 3.8%.  Snap lost $241 million on 560.8 million in revenue that’s up 44% year-over-year and an EPS of $0.03. 
  • 3,000: military maps collected by King George III.
  • 1+ billion: hours of video watched on YouTube every day. 
  • 4: satellites needed to provide continuous global coverage for a fraction of the cost.
  • 31 billion: items stored by Duolingo to deliver lessons in 80+ languages. 24,000 read units per second and 3,300 write units per second. 18 million monthly active users. 6 billion exercises per month. 2 people in devops.
  • 2 billion: daily swipes at Tinder, which has made more than 30 billion matches across 190 countries globally.
  • 12%: reduction in semiconductor industry revenue for 2019. Memory revenue dropped 33%. 
  • $10 billion: Google Cloud run rate. It grew by 53.6% in the last year. 
  • 90%: of all Bluetooth devices will include Bluetooth Low Energy by 2023.
  • 30%: Chrome is slower.

Quotable Stuff:

  • Jeffrey Paul: Connectivity will be the great equalizer in the future. With high-quality, uncensored, reliable access to the global network, intelligent and resourceful people located virtually anywhere can operate on substantially similar footing to anyone else so equipped. Billions of people, blocked from accessing the Internet due to lack of infrastructure or local greed or fraud related to same, are presently kept from participating in the global knowledge economy. Starlink will remedy this, to some extent. Save for the wide deployment of the Internet itself, Starlink and its spiritual siblings launched by others may be one of the crowning technological achievements of our generation. Preliminary tests with fewer than 100 satellites up showed approximately 600Mbps available as tested on an aircraft in flight to Starlink. 
  • Werner Vogels: There’s lots of things I would have done differently [at AWS]. For example, in the beginning, we combined identity and account. Those are two different things. Identity is a security component, and account is what you bill things to. If we had been smarter, we would have separated them, as we eventually did.
  • Mark Lapedus: Today’s microelectronics are organized with 80% of things in the cloud and 20% on the edge. In five years from now, it will be reversed. It will be 80% on the edge and only 20% in the cloud. There is some rationale behind this, telling us that it will go in this direction. It is a question of the privacy of data.
  • @lethain: One morning as the golden rays of sunlight drifted into the rusting windows of our [Digg] Potrero office, our fortunes started to change. Facebook activity spiked and for one glorious day we were the top ranking newsfeed application on Facebook. A day later our haggard data pipeline plopped the day’s analytics into Hive, and our two data scientists, Alan and Tara, ran the analysis – what was the nature of our salvation? Our hero? A piece of malware pretending to be a Justin Bieber-Selena Gomez sex tape. Our future? Selling the business, the patents, and most valuably the team to the highest bidder.
  • @kvlly: Protip: If you’re stuck on a coding issue, sleep on it. That way when you wake up and try to fix it and it’s still broken, at least you got some sleep.
  • @joe_hellerstein: 1/3 This deserves a longer discussion but (1) step functions are very slow, see paper. 2/3 (2) invoking multiple fns is so slow you can’t contemplate interesting logic cast into multiple fns the way you would in a standard programming language, so give up on pure stateless functional programming in that regime. 3/3 (3) as a result I question whether current wisdom on patterns vs antipatterns for serverless is sound. Perhaps constrained by 1st gen FaaS limitations, which basically limit programming to coarse-grained workflow. Substantially undercuts the potential of the platform.
  • @eturner303: Wow: Google’s “Meena” chatbot was trained on a full TPUv3 pod (2048 TPU cores) for **30 full days** – That’s more than $1,400,000 of compute time to train this chatbot model. (! 100+ petaflops of sustained compute !)
  • Matthew Skelton: Co-design the organisation and the system architecture.
  • Geoff Roberts: The lesson here is simple—don’t be distracted by new or adjacent markets until you’re truly winning and dominating in your own. You don’t need a huge TAM to build a big company
  • Scott Aaronson: Later, Krishna explained why quantum computers will never replace classical computers: because if you stored your bank balance on a quantum computer, one day you’d have $1, the next day $1000, the day after that $1 again, and so forth! He explained how, where current supercomputers use the same amount of energy needed to power all of Davos to train machine learning models, quantum computers would use less than the energy needed to power a single house. New algorithms do need to be designed to run neural networks quantumly, but fortunately that’s all being done as we speak.
  • Michael Letko: This is one of the first times we’re getting to see an outbreak of a new virus and have the scientific community sharing their data almost in real time, rather than have to go through classic route of going through the journals
  • Roger Lee: My advice to marketplace companies, considering all this, is to pay attention to unit economics—not just growth. Growth without attractive unit economics is just not sustainable; it’s like gorging on empty calories. So, what exactly are attractive unit economics in this sector? I see three key metrics to track. First, have a plan to break even on paid acquisition within 12 months. Second, maintain an “LTV to CAC” ratio, which means lifetime customer value compared to customer-acquisition costs, of 3X or more within 2-3 years of customer acquisition. Finally, you should also strive to comply with the so-called “Rule of 40”—the idea that a company’s growth rate plus EBITDA margin should equal or exceed 40%. Younger companies can hit this benchmark by growing very rapidly (e.g., 100% annual growth with -60% EBITDA margins). But as they mature and growth slows, they need to focus on efficiency (and eventually profitability) to keep up (e.g., 40% revenue growth with 0% EBITDA margins, 20% growth with 20% EBITDA margins, etc.)
  • @etherealmind: Context: Google spent 0.000040625% of gross 2019 revenue on a bug bounty program. The avg Google salary employee costs ~240K (120K salary doubled) or roughly 27 FTE roles. So it was cheap.
  • @dhh: Now that our industry is finally recovering from the mass delusion that microservices was going to be the future, it’s surely time for the even bigger delusion that serverless is what’s going to provide the all-purpose salvation 🙄😂
  • @Ned1313: In seven years of consulting, I saw no major evidence of this. Almost all of my projects were moving things into the cloud or creating new things in the cloud. I started out working exclusively on projects to install on-premises hardware, and quickly moved to exclusively cloud.
  • Alexandra Mousavizadeh: The US is the undisputed leader in AI development, the Index shows. The western superpower scored almost twice as highly as second-placed China, thanks to the quality of its research, talent and private funding. America was ahead on the majority of key metrics – and by a significant margin. However, on current growth experts predict China will overtake the US in just five to 10 years.
  • Dan Luu: Reaching 95%-ile isn’t very impressive because it’s not that hard to do. I think this is one of my most ridiculable ideas. It doesn’t help that, when stated nakedly, that sounds elitist. But I think it’s just the opposite: most people can become (relatively) good at most things.
  • @kellabyte: The only highly scalable system I’ve ever witnessed myself that was thousands of requests per second was written in Ruby and it had something like 2,000 instances.
  • Alexander Krylatov: [Transport Engineers] do not have competencies in the field of system-related increases in traffic performance. If engineers manage to achieve local improvements, after a while the flows rearrange and the same traffic jams appear in other places.
  • Jonathan Blow: And so Rust has a good set of ingredients there. The problem that I have with it is when I’m working on really hard stuff, I don’t exactly know what I’m doing for a long time. And so if the cost of experimentation is driven too high, it actually impairs my ability to get work done. 
  • Benedict Evans: Software ate the world, so all the world’s problems get expressed in software. We connected everyone, including the bad people.
  • @_joemag: Our industry continues to hurt from the fact that one year is single worst length of time for certificate expirations. Too short to be rare and ignorable, and too long to build an operational muscle around renewing them.
  • Paul Marks: For the future, however, the banking and finance industry’s move to voice-based account management services—following on the success of voice assistants like Amazon’s Alexa, Google Home, and Apple’s Siri—may end up making them vulnerable to hack attacks via deepfake audio. That might add a voiceprint component to digital doppelgängers, with heavy ramifications for services.
  • Christian Sandström: [Clayton Christensen] argued that companies were being misled by the very same practices—such as listening to their customers, or designing next-generation products for existing users—that had made them successful in the first place. Firms performed well by adhering to the needs of key actors in the environment, but over time, the environment started to impose a great indirect control over firms, eventually putting them in deep trouble. The theory was beautifully counterintuitive.
  • Ivo Bolsens: In the future you will see more FPGA nodes than CPU nodes. The ratio might be something like one CPU to 16 FPGAs, acceleration will outweigh general compute in the CPU.
  • MIT Tech Review: In 2019, the most significant investment or priority for surveyed companies’ technology strategy was data architecture. Nearly 80% of respondents picked data architecture as a top three priority investment last year. This paves the way for data analytics and AI which will become much greater priorities in 2020, jumping from 63% to 91%. Investment in digitizing products and services was the number two priority in 2019, but slides behind the Internet of Things (IoT), cybersecurity, and even blockchain technology as a priority for 2020.
  • YawningAngel: I’m an enterprise GCP customer, Google Support have a few superbly irritating habits: 1. They link to generic documentation that doesn’t solve my problem 2. They insist that things that are clearly bugs aren’t bugs until they’re provided with some trivial reproduction case that satisfies them 3. They refuse to advise on issues with beta products despite half of GCP’s products being in a beta 4. They are sometimes just flat-out wrong (but confidently so) about the cause of an issue. Give me AWS support any day
  • @sandy_carter: The U.S. Navy is moving the largest ERP system — 72,000 users across 6 commands — to the #AWS Cloud. The milestone came 10 months ahead of schedule.
  • @ajaynairthinks: Sunday musings as I write an exec doc – It’s critical for those us building “way of life” tech to understand the human factors of our offering. For example, whatever you think of Kubernetes as a technology, it has a very powerful force working in its favor : Hope.
  • Rehan van der Merwe: This blog will demonstrate the high throughput rate that DynamoDB can handle by writing 1 million records in 60 seconds with a single Lambda, that is approximately 17k writes per second. This is all done with less than 250 lines of code and less than 70 lines of CloudFormation. We will also show how to reach 40k writes per second (2.4 million per minute) by running a few of the importer Lambdas concurrently to observe the DynamoDB burst capacity in action.
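    • The throughput in that quote comes from a standard fan-out pattern: DynamoDB’s BatchWriteItem call accepts at most 25 items, so a bulk importer chunks its records and issues many batches concurrently. A minimal sketch (the `write_batch` callable is a hypothetical stand-in for the real boto3 call, which needs credentials and a table):

```python
# Sketch of the fan-out pattern behind high-throughput DynamoDB imports:
# BatchWriteItem takes at most 25 items per call, so chunk the records
# and push the batches through a pool of workers.

from concurrent.futures import ThreadPoolExecutor

MAX_BATCH = 25  # DynamoDB BatchWriteItem item limit


def chunk(records, size=MAX_BATCH):
    """Split records into BatchWriteItem-sized chunks."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def import_records(records, write_batch, workers=8):
    """Fan batches out across a thread pool; returns the batch count."""
    batches = chunk(records)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(write_batch, batches))  # one call per 25-item batch
    return len(batches)
```

    At 17k writes per second a single Lambda is really just running this loop hot; running several importers concurrently is what pushes into DynamoDB’s burst capacity.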
  • @rbranson: An interesting trend that GDPR/CCPA is driving within engineering teams is that data retention is being considered at the onset of projects pretty consistently. It’s a pretty easy box to check for most systems, particularly when built with finite retention from the start.
  • Paul Teich: Google may have opted to design more intelligence into its routers (see more about the Andromeda network controller and the Click modular router to dive into the architecture), relegating its Intel NICs to use Intel QuickData Technology’s DMA and to offload network encryption and decryption from server processors. If GCP has deployed SmartNICs, the search engine giant and cloud provider has stayed completely silent about it.
  • Backblaze: What we’re seeing in our fleet right now is a higher-than-typical failure rate among some of our 12TB Seagate drives.
  • Diego Pacheco: Fastify did 6K RPS (Request per Second). Netty did 23K RPS (Request Per Second). Actix did 53K RPS (Request Per Second). Actix 1ms latency. Netty 4ms latency. Fastify 14ms latency
  • Sabine Hossenfelder: To start, I need to say what undecidability and uncomputability are in the first place. The concepts go back to the work of Alan Turing who in 1936 showed that no algorithm exists that will take as input a computer program (and its input data), and output 0 if the program halts and 1 if the program does not halt. This “Halting Problem” is therefore undecidable by algorithm. So, a key way to know whether a problem is algorithmically undecidable – or equivalently uncomputable – is to see if the problem is equivalent to the Halting Problem.
  • Ayende Rahien: For certain type of applications, there is a hard cap of what load you can be expected to handle. And you should absolutely take advantage of this. The more stuff you can not do, the better you are. And if you can make reasonable assumptions about your load, you don’t need to go crazy. Simpler architecture means faster time to market, meaning that you can actually deliver value, rather than trying to prepare for the Babies’ Apocalypse.
  • Roger Dooley: As mints, early Life Savers were unremarkable except for a memorable brand based on their proprietary hole in the center. Other mints no doubt tasted about the same. The key to the explosive growth of Life Savers was convenience: customers could make the purchase without thinking. Noble was smart enough to place the displays not in candy stores but in the exact places where people might want a mint – when paying at a tobacco store, saloon, or restaurant. And, the “nickel change” practice ensured near-zero thought and effort to make the sale.
  • anderskaseorg: The problem with using a classical computer to solve an NP-hard problem is that as the problem size increases, in the worst case, as far as we know, the running time increases exponentially. The problem with using a photonic computer to solve an NP-hard problem is that as the problem size increases, the amount of light at the output node drops exponentially. Even under the assumption that you can get rid of all background noise, that means you need to run your detector for longer so as to be able to detect the output photons, and…the running time increases exponentially.
  • Wesley Aptekar-Cassels: Writing non-trivial software that is correct (for any meaningful definition of correct) is beyond the current capabilities of the human species. Being aligned with teammates on what you’re building is more important than building the right thing. Peak productivity for most software engineers happens closer to 2 hours a day of work than 8 hours. Most measures of success are almost entirely uncorrelated with merit. Thinking about things is a massively valuable and underutilized skill. Most people are trained to not apply this skill. The fact that current testing practices are considered “effective” is an indictment of the incredibly low standards of the software industry. How kind your teammates are has a larger impact on your effectiveness than the programming language you use. 
  • Murat: Liu Cixin has a programming background. He should have realized that this trust problem is better handled by using a Byzantine fault-tolerant consensus protocol. Instead of one person, choose 7 people to act as swordholders and this system will tolerate 2 Byzantine swordholders (because N>3*F).
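    • The quorum arithmetic in Murat’s quote checks out directly: a Byzantine fault-tolerant protocol needs N > 3F replicas to tolerate F traitors, so 7 swordholders survive at most 2 Byzantine ones. A two-line sanity check:

```python
# Byzantine quorum arithmetic: a BFT protocol needs N > 3F replicas
# to tolerate F Byzantine (arbitrarily faulty) participants.

def max_byzantine_faults(n):
    """Largest F such that n > 3F."""
    return (n - 1) // 3


def replicas_needed(f):
    """Smallest N tolerating F Byzantine faults: N = 3F + 1."""
    return 3 * f + 1
```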
  • Traveloka: One of the advantages of serverless architecture is that it costs less compared to renting a dedicated server. In our case, we managed to trim down our spending by more than 90% compared to our existing cost with the EC2 server. Another advantage is that we don’t have to maintain our server since AWS is doing it for us. As a result, the latency to load front-end content is also greatly reduced.
  • Sabine Hossenfelder: When I speak about a minimal length, a lot of people seem to have a particular image in mind, which is that the minimal length works like a kind of discretization, a pixelation of a photo or something like that. But that is most definitely the wrong image. The minimal length that we are talking about here is more like an unavoidable blur on an image, some kind of fundamental fuzziness that nature has. It may, but does not necessarily come with a discretization. What does this all mean? Well, it means that we might be close to finding a final theory, one that describes nature at its most fundamental level and there is nothing more beyond that.
  • Jianfeng Gao: Deep learning, in some sense, is to map all the knowledge from the symbolic space to neural space because in the neural space, all the concepts are represented using continuous vectors. It’s a continuous space. It has a lot of very nice math properties. It’s very easy to train. That’s why, if you have a large amount of data and you want to train a highly non-linear function, it’s much easier to do so in the neural space than in the symbolic space, but the disadvantage of the neural space is it’s not human comprehensible. 
  • @cocoaphony: Periodic Reminder: When debugging, you must first accept that something you believe is true is not true. If everything you believed about this system were true, it would work. It doesn’t, so you’re wrong about something. This is a surprisingly common stumbling block for devs.
  • Scott Aaronson: And I replied: I’m flattered by your surely inflated praise, but in truth I should also thank you. You caught me at a moment when I’d been thinking to myself that, if only I could make one or two people’s eyes light up with comprehension about the fallacy of a QC simply trying all possible answers in parallel and then magically picking the best one, or about the central role of amplitudes and interference, or about the “merely” quadratic nature of the Grover speedup, or about the specialized nature of the most dramatic known applications for QCs, or about the gap between where the experimentalists are now and what’s needed for error correction and hence true scalability, or about the fact that “quantum supremacy” is obviously not a sufficient condition for a QC to be useful, but it’s equally obviously a necessary condition, or about the fact that doing something “practical” with a QC is of very little interest unless the task in question is actually harder for classical computers, which is a question of great subtlety … I say, if I could make only two or four eyes light up with comprehension of these things, then on that basis alone I could declare that the whole trip to Davos was worth it.
  • pcr910303: Unison is a functional language that treats a codebase as a content-addressable database[2] where every ‘content’ is a definition. In Unison, the ‘codebase’ is a somewhat abstract concept (unlike other languages where a codebase is a set of files) where you can inject definitions, somewhat similar to a Lisp image. One can think of a program as a graph where every node is a definition and a definition’s content can refer to other definitions. Unison content-addresses each node and aliases the address to a human-readable name. This means you can replace a name with another definition, and since Unison knows the node a human-readable name is aliased to, you can exactly find every name’s use and replace them with another node. In practice I think this means very easy refactoring unlike today’s programming languages where it’s hard to find every use of an identifier.
  • otterley: (I work for AWS. Opinions are my own and not necessarily those of my employer.) I’ve been doing some initial M6g tests in my lab, and while I’m not able to disclose benchmarks, I can say that my real-world experience so far reflects what’s been claimed elsewhere. Graviton2 is going to be a game changer. It’s not like the usual experience with ARM where you have to trade off performance for price, and decide whether migrating is worth the recompilation effort. In my lab, performance of the workloads I’ve tried so far is uniformly better than on the equivalent M5 configuration running on the Intel processor. You’re not sacrificing anything by running on Graviton2. If your workloads are based on scripting languages, Java, or Go, or you can recompile your C/C++ code, you’re going to want to use these instances if you can. The pricing is going to make it irresistible. Basically, unless you’re running COTS (commercial off-the-shelf software), it’s a no-brainer.
  • Mark Lapedus:  A chip consists of three parts—transistor, contacts and interconnects. The transistor serves as the switch in a device. Advanced chips have as many as 35 billion transistors. The interconnects, which reside on the top of the transistor, consist of tiny copper wiring schemes that transfer electrical signals from one transistor to another. The transistor and interconnect are connected by a layer called the middle-of-line (MOL). The MOL consists of tiny contact structures. IC scaling, the traditional way of advancing a design, shrinks the transistor specs at each process node and packs them onto a monolithic die…Scaling is also slowing at advanced nodes. Generally, a 7nm foundry process consists of a contacted poly pitch (CPP) ranging from 56nm-57nm with a 40nm metal pitch, according to IC Knowledge and TEL. At 5nm, the CPP is roughly 45nm-50nm with a 26nm metal pitch. CPP, a key transistor metric, measures the distance between a source and drain contact.

Useful Stuff:

  • How I write backends. Kind of a hybrid cloud plus old school approach, definitely not cloud native, but there’s a lot of solid advice, especially for those who want to avoid managed services. By old school I mean you won’t find Docker, containers, Ansible, GitHub, or anything like that. It’s just you, the code, and the machines. It does mean there’s a lot more for you to do. Fortunately there are a lot of good low-level details on how to deal with issues like environment variables, notifications, code structure, documentation, provisioning, db partitioning, deployment, identity and security, etc. 
    • My approach to backends (as with code in general) is to iteratively strive for simplicity. This approach – and a share of good luck – has enabled me to write lean, solid and maintainable server code with a relatively small time investment and minimal hardware resources. Part of the reason I put all of this out there is to offer a different point of view that has worked for me and others I’ve worked with, and to stimulate debate and fact-based (or at least experience-based) interactions regarding backend lore, instead of pining for best practices that often are under-scrutinized.
    • Tools: Ubuntu Long Term Support version, Node.js Long Term Support version, Redis, NginX, S3, SES, EC2. 
    • Different styles of architectures: they are labeled A – E and range from a local development environment to a high availability horizontally scalable load balanced system.
    • On microservices: Before you rush to embrace the microservices paradigm, I offer you the following rule of thumb: if two pieces of information are dependent on each other, they should belong to a single server. In other words, the natural boundaries for a service should be the natural boundaries of its data.
  • Stateful functions are the next evolution of serverless, and Cloudburst is an example of how that might work. Cloudburst is a faster platform for general-purpose serverless computing. Initial results show it can outperform standard FaaS architectures using systems like AWS Lambda, AWS DynamoDB, and Redis by orders of magnitude. It uses a new design principle called logical disaggregation with physical colocation (LDPC). LDPC requires multi-master data replication to keep “hot” data physically nearby for low-latency access.
    • There’s a paper, article, and code on github. Cloudburst: Stateful Functions-as-a-Service (github, article): Function-as-a-Service (FaaS) platforms and “serverless” cloud computing are becoming increasingly popular. Current FaaS offerings are targeted at stateless functions that do minimal I/O and communication. We argue that the benefits of serverless computing can be extended to a broader range of applications and algorithms. We present the design and implementation of Cloudburst, a stateful FaaS platform that provides familiar Python programming with low-latency mutable state and communication, while maintaining the autoscaling benefits of serverless computing. Cloudburst accomplishes this by leveraging Anna, an autoscaling key-value store, for state sharing and overlay routing combined with mutable caches co-located with function executors for data locality. Performant cache consistency emerges as a key challenge in this architecture. To this end, Cloudburst provides a combination of lattice-encapsulated state and new definitions and protocols for distributed session consistency. Empirical results on benchmarks and diverse applications show that Cloudburst makes stateful functions practical, reducing the state-management overheads of current FaaS platforms by orders of magnitude while also improving the state of the art in serverless consistency.
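    • The LDPC idea is easy to picture: the function executor reads through a cache colocated with it and only falls back to the remote autoscaling key-value store (Anna, in the paper) on a miss. A toy sketch, assuming in-memory dicts stand in for the store and the cache (this is illustrative, not Cloudburst’s actual API):

```python
# Illustrative sketch of "logical disaggregation with physical
# colocation": state logically lives in a remote autoscaling KVS,
# but a cache colocated with the function executor serves hot keys
# at memory speed. Not Cloudburst's real API.

class ColocatedCache:
    def __init__(self, kvs):
        self.kvs = kvs    # remote autoscaling store (here: a plain dict)
        self.local = {}   # cache living next to the function executor

    def get(self, key):
        if key not in self.local:       # miss: one remote round trip
            self.local[key] = self.kvs[key]
        return self.local[key]          # hit: no network involved


def stateful_fn(cache, user):
    """A serverless function made stateful via the colocated cache."""
    return f"hello, {cache.get(user)}"
```

    The hard part the paper spends its time on is exactly what this sketch ignores: keeping many such caches consistent once functions mutate state.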
  • Performance testing HTTP/1.1 vs HTTP/2 vs HTTP/2 + Server Push for REST APIs
    • If speed is the overriding requirement, keep using compound documents.
    • If a simpler, elegant API is the most important, having smaller-scoped, many endpoints is definitely viable.
    • Caching only makes a bit of difference.
    • Optimizations benefit the server more than the client.
  • Benedict Evans shared his yearly tech trends talk at Davos. Tech in 2020: Standing on the shoulders of giants. The history of computing technology has been a series of S-curves, with each S-curve spanning a period of about 15 years. What’s the next big thing? The answer is divided into different categories. Frontier tech: quantum computing, new battery chemistry, neural interfaces, autonomy, AR options. Important but narrow: Drones, IoT, Voice, Wearables, Robotics, eSports, 3D printing, VR, Micro-satellites. Structural learning: ML, crypto?, 3G/4G/5G, cloud. The next platform? AR glasses?
  • GoDays 20 videos are now available. You might like One Decade of Go or Building a multiplayer game server in Go and Webassembly.
  • Want to do great work? 
    • You need a diverse set of interests. So don’t think you need to do just one thing for the rest of your life. That might actually hurt you. Tim Harford: Well, a pattern that emerged was clear, and I think to some people surprising. The top scientists kept changing the subject. They would shift topics repeatedly during their first 100 published research papers. Do you want to guess how often? Three times? Five times? No. On average, the most enduringly creative scientists switched topics 43 times in their first 100 research papers. Seems that the secret to creativity is multitasking in slow motion. Eiduson’s research suggests we need to reclaim multitasking and remind ourselves how powerful it can be. And she’s not the only person to have found this. Different researchers, using different methods to study different highly creative people have found that very often they have multiple projects in progress at the same time, and they’re also far more likely than most of us to have serious hobbies. Slow-motion multitasking among creative people is ubiquitous. 
    • You need to procrastinate—at least according to the Originals. It’s called the Zeigarnik effect: “activity that has been interrupted may be more readily recalled. It postulates that people remember uncompleted or interrupted tasks better than completed tasks.” We have a better memory for incomplete tasks, which means they stay in our memory longer so we can keep thinking about them. Once we check an item off a list we forget it. That’s how procrastination makes you more creative. You’ll think of something over time and come up with new ideas. It helps with vuja de—looking at something you see all the time through fresh eyes. But you eventually have to finish something. So enter quick mode to generate ideas; then slow down so you get access to new insights; then toggle back to productivity mode.
  • Do you allocate memory in thread-local storage across a lot of threads? If you create a lattice structure in your memory allocations then you may experience really long GC pauses. That must have been a tough one to find. The slow slowdown of large systems
  • There’s a new kind of memory in town. University of Lancaster Invents Yet Another Memory: The Memory Guy recently encountered some stories in the press about “UltraRAM” which is the name for a new type of NVRAM developed by researchers at Lancaster University in the UK. According to the papers, the new memory exploits the quantum properties of a triple-barrier Resonant Tunneling (RT) structure to produce a nonvolatile memory that can be either read or written with low voltages. Not only can the Lancaster team’s approach eliminate flash wear, but it also promises to allow DRAMs to become nonvolatile, thereby eliminating two energy-wasting characteristics that afflict DRAM technology: Refresh and Destructive Read. But the new technology suffers from the same issue that a lot of emerging memories do – it uses chemical elements that are not already found in a standard CMOS process, so the transition from DRAM’s standard CMOS to something with III-V will motivate DRAM makers to put off using it as long as possible. 
  • Why Discord is switching from Go to Rust
    • As usual with a garbage collected language the problem was CPU stalls due to garbage collection spikes. But in non-GC languages you have to worry about memory fragmentation, especially in long-lived processes. When you get that sev 1 bug that happens after two months of flawless execution it will often be a memory allocation failure due to memory fragmentation. So you end up creating your own memory allocator anyway. 
    • But there are other advantages…
    • When we started load testing, we were instantly pleased with the results. The latency of the Rust version was just as good as Go’s and had no latency spikes! Remarkably, we had only put very basic thought into optimization as the Rust version was written. Even with just basic optimization, Rust was able to outperform the hyper hand-tuned Go version. 
    • After a bit of profiling and performance optimizations, we were able to beat Go on every single performance metric. Latency, CPU, and memory were all better in the Rust version.
    • Another great thing about Rust is its quickly evolving ecosystem. 
    • Along with performance, Rust has many advantages for an engineering team. For example, its type safety and borrow checker make it very easy to refactor code as product requirements change or new learnings about the language are discovered. Also, the ecosystem and tooling are excellent and have a significant amount of momentum behind them.
    • Also, a counterpoint: Our business case for using Go – it’s all about saving money.
  • Videos from the ZooKeeper Meetup @ Facebook: Advancing the state of distributed coordination are now available. You might like the Scalability talk in particular.
  • Amazon retail needed a data lake to be the gravitational attractor for all their massive and disparate data sets. So what did AWS do? They made a service out of it, of course: AWS Lake Formation. It uses all the services you might imagine, but the interesting point is the co-evolving nature of Amazon retail and AWS. They inform and help each other. And since a lot of people need the same kind of stuff Amazon retail needs, there’s a very virtuous circle going on.
  • Competition is a great thing. 
    • AMD Threadripper 3990X 64-Core Beast Seen Crushing $20K Of Xeon Platinum CPUs In Benchmark Leak: As you may already know, the Threadripper 3990X is a beastly chip. Popping open the hood reveals 64 physical cores and 128 threads of high-end desktop (HEDT) muscle, with a 2.9GHz base clock, 4.3GHz boost clock, and 256MB of L3 cache (along with 32MB of L2 cache and 4MB of L1 cache). It all comes wrapped in a high-octane 280W package. As shown in the database entry above, the Threadripper 3990X achieved a Processor Arithmetic score of 1,786.22 GOPS while running at 4.35GHz. That’s a touch above what AMD rates the maximum boost clock, so it is possible the CPU was overclocked for this benchmark run (assuming the frequency is accurate). To put that score into perspective, it is nearly 18 percent higher than a system with two second generation Intel Xeon Scalable Platinum 8280 processors (Cascade Lake).
    • semiaccurate: That brings us to today with the new Cascade Lake-R (-R is for ‘Refresh’) line of CPUs, 14 SKUs in total. With them Intel is making the whisper numbers official and gutting prices. When SemiAccurate says gutting, we aren’t talking 5, 10, or even 15%, we are talking a rollback. This is not a fight anymore, it is Intel in desperate retreat. More interesting is what effect these changes will have on capacity constraints and margins, the changes are far deeper than they look on paper if you get the underlying tech.
  • Often contrarians are on the right path. And if you’re a contrarian against Microservices you have to come up with what you are for. DHH has a name for that: integrated systems
    • Integrated systems for integrated programmers: Microservices as an architectural gold rush appealed to developers for the same reason TDD appeals to developers: it’s the pseudoscientific promise of a diet. The absolution of a new paradigm to wash away and forgive our sins. Who doesn’t want that? Well, maybe you? Now after you’ve walked through the intellectual desert of a microservice approach to a problem that didn’t remotely warrant it (ie, almost all of them). Maybe now you’re ready to hear a different story. There’s a slot in your brain for a counterargument that just wasn’t there before. So here’s the counterargument: Integrated systems are good. Integrated developers are good. Being able to wrap your mind around the whole application, and have developers who are able to make whole features, is good! The road to madness and despair lays in specialization and compartmentalization.
    • Now does this same idea apply to microfrontends? (let’s just skip to the end and make it one word). 
  • DataDog examined data from thousands of companies to characterize their Serverless usage. Here’s what they found in The State of Serverless: Half of AWS users have adopted Lambda; Lambda is more prevalent in large environments; Container users have flocked to Lambda; Amazon SQS and DynamoDB pair well with Lambda; Node.js and Python dominate among Lambda users; The median Lambda function runs for 800 milliseconds; One fifth of Lambda functions runs for 100 ms or less; Half of Lambda functions have the minimum memory allocation; Two thirds of defined timeouts are under 1 minute; Only 4% of functions have a defined concurrency limit.
  • Are you afraid of the gray goo scenario for the end of the world? Don’t be. The Issues We Face at the Nano Scale. Nanomachines need a really controlled environment to live. They need the right pH, salt, and temperature. Change any of those parameters and nanomachines won’t work. World saved.
  • The corporation wars have begun, they have. Why American Farmers Are Hacking Their Tractors With Ukrainian Firmware. Open source can help here. Nobody needs to buy a John Deere tractor from a company that has obviously been hijacked by HBS graduates. There’s an open source tractor. Though it is nice to see farmers join the revolution once again. 
  • Lots to learn. Multi-Version Concurrency Control [Design Decisions] (CMU Databases / Spring 2020)(slides): MVCC is the best approach for supporting txns in mixed workloads.
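The core idea from the MVCC lectures, keep multiple timestamped versions of each value so readers never block writers, can be sketched in a few lines. This is a toy illustration under simplified assumptions (a single global clock, no write-write conflict detection), not CMU's reference design:

```python
import itertools

class MVCCStore:
    """Toy multi-version store: each write appends a timestamped version;
    a read sees the newest version at or before its snapshot timestamp."""
    def __init__(self):
        self._clock = itertools.count(1)   # monotonically increasing timestamps
        self._versions = {}                # key -> list of (ts, value), append-only

    def begin(self):
        """Start a transaction by taking a snapshot timestamp."""
        return next(self._clock)

    def write(self, key, value):
        ts = next(self._clock)
        self._versions.setdefault(key, []).append((ts, value))
        return ts

    def read(self, key, snapshot_ts):
        """Return the latest version visible to the snapshot, or None."""
        visible = [v for ts, v in self._versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None

store = MVCCStore()
store.write("x", "v1")
snap = store.begin()      # snapshot taken here
store.write("x", "v2")    # this later write is invisible to snap
assert store.read("x", snap) == "v1"
```

The append-only version chain is exactly why MVCC suits mixed workloads: analytical reads run against a stable snapshot while transactional writes keep appending.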
  • Nice set of resources on Azure security. MarkSimos/MicrosoftSecurity. Also, Remote Cloud Execution – Critical Vulnerabilities in Azure Cloud Infrastructure (Part I)
  • If you are making the big design decision of Fargate vs Lambda you’ll want to read this. One note: if security is important you still can’t beat API Gateway. Fargate vs Lambda
    • Despite the price reduction of HTTP APIs, unless your traffic is extremely spiky, Fargate is a clear winner when it comes to APIs.  It is important to call out that API Gateway does more than an ALB though, providing rate limiting (though not on HTTP APIs yet) and authorization
    • This scenario [processing messages from a queue] makes Lambda much more competitive, even when using spot pricing, which is usually a viable option for async processing tasks. For spiky loads, and low to moderate traffic, Lambda still comes out well here.
    • When it comes to creating new applications, SAM makes the process incredibly easy for Lambda based architectures. Fargate, on the other hand, requires far more work to get going with. 
    • Lambda particularly shines in two areas – scale to zero, and rapid scaling. The fact that you don’t have to pay for idle applications is very useful for low traffic workloads and dev environments.
    • Though Fargate did end up being faster, I think the more important quality here is the consistency of the response times. But the meaning of these numbers is entirely dependent on the use case. There are a lot of use cases where the increased latency and inconsistency are not a problem, especially when the APIs are doing enough work that the additional 100ms overhead isn’t as noticeable. For more time sensitive or critical APIs, such as ones that are offered as a paid service, it is more important to offer a fast and consistent experience. For asynchronous, queue-consuming tasks, performance is less of an issue. Outside of edge cases, either platform will do the job admirably.
    • Despite the recent improvements, there are still areas where Lambda and Fargate do not compete due to technical limitations. Things like long-running tasks or background processing aren’t as doable in Lambda. Some event sources, such as DynamoDB streams, cannot be directly processed by Fargate.
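The “consistency of the response times” point above comes down to tail percentiles: two services can share a median while their p99s diverge wildly. A quick way to eyeball that from raw latency samples (the numbers here are made up for illustration; this is a simple nearest-rank percentile, not a benchmarking tool):

```python
def percentile(samples, p):
    """Nearest-rank percentile (p in 0..100) of a list of latency samples."""
    s = sorted(samples)
    k = max(0, min(len(s) - 1, round(p / 100 * len(s)) - 1))
    return s[k]

# Hypothetical latencies (ms): a steady service vs. one with cold-start spikes.
steady = [100, 102, 101, 99, 103, 100, 101, 100]
spiky  = [80, 85, 82, 81, 900, 83, 84, 950]

# Medians look comparable; the tails tell the real story.
print("steady p50/p99:", percentile(steady, 50), percentile(steady, 99))
print("spiky  p50/p99:", percentile(spiky, 50), percentile(spiky, 99))
```

Whether a fat tail matters is use-case dependent, as the article says: fine for async queue consumers, costly for paid, latency-sensitive APIs.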
  • Want to become a quantum mechanic? A Quantum Computation Course.  
  • I can’t wait for the post modern apps. What will be our reaction to modernism? What are Modern Applications?: Modern Apps are packaged to run on platforms; and the packaging piece is very important – it expresses a way by which the application and its dependencies are brought together, and it expresses a way by which the application and its dependencies can be updated over time to deal with functional improvements, security vulnerabilities etc. It seems that the industry has settled on two directions for platforms – cloud provider native (and/)or Kubernetes.
  • Modernizing the internet with HTTP/3 and QUIC. Faster handshakes. Most of the packet header is encrypted. The most interesting new feature is connection migration. Connection can move with you as you switch networks. That’s cool. 
  • Process migration used to be a thing. Can’t say I ever thought of migrating WebWorkers. It’s really hard to believe given the power of phones these days that this would be a win. And the curve is going the other way. As phones have more SoC integration they will only get more powerful. Seamless offloading of web app computations from mobile device to edge clouds via HTML5 Web Worker migration: So you’ve got mobile devices without the computing power needed to deliver a great experience, and cloud computing that has all the needed power that’s too far away. Edge servers are the middle ground – more compute power than a mobile device, but with latency of just a few ms. The kind of edge server envisaged here might, for example, be integrated with your WiFi access point.
  • Data Migrations Don’t Have to Come with Downtime: Here we found the perfect abstraction we needed for a large-scale Redis migration; by adding Envoy in as a middle-man to all of our Redis instances, we could more intelligently coordinate the migration of data to our new Redis clusters while continuing to serve traffic to users. 
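The proxy-in-the-middle pattern described above, keep serving traffic while data moves between clusters, is often implemented as double-writes with lazy backfill on reads. A stub sketch of that policy using plain dicts as stand-in stores (real deployments would do this inside Envoy's Redis filter, not application code):

```python
class MigratingCache:
    """Double-write to old and new stores; read from new, fall back to old
    and backfill, so data migrates lazily with zero downtime."""
    def __init__(self, old_store, new_store):
        self.old, self.new = old_store, new_store

    def set(self, key, value):
        self.old[key] = value   # keep the old cluster correct until cutover
        self.new[key] = value

    def get(self, key):
        if key in self.new:
            return self.new[key]
        value = self.old.get(key)
        if value is not None:
            self.new[key] = value   # lazy backfill on read
        return value

old, new = {"a": 1}, {}
cache = MigratingCache(old, new)
assert cache.get("a") == 1      # served from old, backfilled into new
assert new["a"] == 1
cache.set("b", 2)
assert old["b"] == 2 and new["b"] == 2
```

Once the new cluster has been warmed (by reads or a background scan), reads and writes can be cut over and the old cluster retired.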
  • Why Starlink is a Big Deal
    • Duh: HFT firms will be the first customers, seeking a few-millisecond reduction in the transmission of realtime data between markets.
    • Several nations will soon summarily ban any use of the system, as it will very effectively bypass longstanding local monopolies on last-mile profit and censorship.
    • SpaceX will be pressured by the governments in the jurisdictions where their staff reside (where their physical threats of enforcement can be carried out) to selectively censor/blackout or otherwise wiretap use of the system
    • Multinationals with multiple locations will purchase private WAN systems from Starlink as emergency backups for their existing IP WAN/VPN network
    • The ground-based antenna system will be too large to be handheld (it’s described as “pizza-box” or “briefcase” sized), so some automakers will likely build them into the roofs of cars or trucks, which will then provide normal Wi-Fi signals to the occupants
    • Airplanes. You can’t bring your own antenna, so the airline will simply use Starlink for backhaul and continue to charge you 10x for a tenth of the speed available.
    • Cruise Ships. Same deal.
    • Alternately: this potentially makes full-time, international waters seasteading practically viable.
    • Wireless service providers will be able to use Starlink to provide backhaul for new GSM/LTE/5G towers, enabling mobile phone usage in places previously impractical. 
  • Why is Shopify moving to React Native?
    • We learned from our acquisition of Tictail (a mobile-first company that focused 100% on React Native) in 2018 how far React Native had come, and made 3 deep product investments in 2019
    • Shopify uses React extensively on the web and that know-how is now transferable to mobile
    • we see the performance curve bending upwards (think what’s now possible in Google Docs vs. desktop Microsoft Office) and we can long-term invest in React Native like we do in Ruby, Rails, Kubernetes and Rich Media.
  • Extensive pro and con list for picking between serverless and kubernetes. But if you have to ask go serverless. Scaling My App: Serverless vs Kubernetes
  • Lumigo, after balancing latency, scalability and cost, saved 60%+ by switching from Kinesis Streams to Kinesis Firehose. Kinesis Firehose Advantages: You pay only for what you use; It has higher limits by default than Streams; Overprovisioning is free of charge; It is a fully managed service. Kinesis Firehose Disadvantages: No more real-time processing. In the worst case, it can take up to 60 seconds to process your data; You can’t directly control Firehose processing scaling yourself; There’s no visibility into your Kinesis Firehose configuration.
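Bulk ingestion into Firehose is typically done with PutRecordBatch, which caps a batch at 500 records, so producers chunk before sending. A minimal sketch; `chunk` is our own helper, and the boto3 call is shown commented since it needs real AWS credentials and a hypothetical stream name:

```python
def chunk(records, size=500):
    """Split records into Firehose-sized batches (PutRecordBatch max is 500)."""
    return [records[i:i + size] for i in range(0, len(records), size)]

batches = chunk([{"Data": b"event-%d" % i} for i in range(1200)])
assert [len(b) for b in batches] == [500, 500, 200]

# With boto3 (not run here; "my-stream" is a placeholder):
# import boto3
# firehose = boto3.client("firehose")
# for batch in batches:
#     firehose.put_record_batch(DeliveryStreamName="my-stream", Records=batch)
```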
  • Obviously since this is from YugabyteDB you should take it with a grain of salt, but they are making a detailed series of articles on Distributed SQL. Start with What is Distributed SQL? for background. For a comparison between systems (Vitess, Citus, VoltDB, NuoDB, ClustrixDB) there’s Distributed SQL vs. NewSQL
    • NewSQL databases were first created in the early 2010s to solve the write scalability challenge associated with monolithic SQL databases. They allowed multiple nodes to be used in the context of a SQL database without making any significant enhancements to the replication architecture. The cloud was still in its infancy at that time, so this strategy worked for a few years. However, as multi-zone, multi-region and multi-cloud deployments became the standard for modern applications, these databases fell behind in developer adoption. On the other hand, distributed SQL databases like Google Spanner and YugabyteDB are built ground-up to exploit the full elasticity of the cloud and are also designed to work on inherently unreliable infrastructure.
    • Also, Internals of Google Cloud Spanner
  • Individually smart cars will not solve traffic jams. We actually need centralized command and control. Mathematicians have solved traffic jams, and they’re begging cities to listen: All drivers need to be on the same navigation system. Cars can only be efficiently rerouted if instructions come from one center hub. One navigation system rerouting some drivers does not solve traffic jams.
  • Cristos Goodrow: YouTube Algorithm. Not much actual detail here. The main algorithm that drives YouTube recommendations is collaborative filtering. And of course, advertising. 
  • Follow an evolution as old as distributed systems. Eventually in a distributed system you want to generate as much infrastructure as you can from an IDL. This allows you to standardize across all clients and the backend. Lyft’s Journey through Mobile Networking. Lyft landed on protobufs and saw reduced response payload sizes (sometimes cutting down size by > 50%), improved success rates, and faster response times on larger endpoints. Best of all, product teams were able to take advantage of these improvements without any changes to their code, thanks to the IDL abstraction in place.
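Lyft's payload-size win comes from protobuf's packed binary encoding versus self-describing JSON: field names live in the IDL, not on the wire. Without pulling in protobuf itself, the effect is easy to demonstrate with a fixed binary layout (`struct`) standing in for the generated encoding; the field names and values are invented for illustration:

```python
import json, struct

ride = {"ride_id": 1234567, "eta_seconds": 240, "surge_x100": 125}

# JSON carries every field name with every message.
json_bytes = json.dumps(ride).encode()

# A schema-driven encoding carries only the values (here: uint32, uint32, uint16).
packed = struct.pack("<IIH", ride["ride_id"],
                     ride["eta_seconds"], ride["surge_x100"])

assert len(packed) == 10
assert len(packed) < len(json_bytes) / 2   # easily a >50% reduction
```

This is also why the improvement was free for product teams: once the IDL abstraction generates both the client and server serialization, swapping JSON for protobuf changes the wire format without changing any calling code.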
  • Tinder on Taming ElastiCache with Auto-discovery at Scale. As you might imagine, Tinder is highly read-based. They started with Redis on EC2 and moved to ElastiCache so they didn’t have to manage that mess anymore. But it wasn’t quite that simple. They ran into problems using the Jedis Java client, namely failover intolerance. So they built a new client that maintains a local and dynamic view of the cluster, refreshing in response to cluster topology changes, and more fully utilizes cluster resources by performing read operations on cluster slave nodes. The result: they achieved their first true auto-failover with zero developer intervention, decreased latency on calls to backend services, and reduced the computational load on the cluster’s master nodes.
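The essence of the client Tinder built is a refreshed topology view plus a routing rule that sends reads to replicas. A stripped-down sketch of that policy with a stub transport (the real client wraps Jedis and ElastiCache cluster discovery; `discover` here is a hypothetical callable):

```python
import random

class ReplicaAwareClient:
    """Route writes to the master, spread reads across replicas, and
    rebuild the local view whenever the cluster topology changes."""
    def __init__(self, discover):
        self.discover = discover   # callable -> {"master": ..., "replicas": [...]}
        self.refresh()

    def refresh(self):
        topo = self.discover()
        self.master, self.replicas = topo["master"], topo["replicas"]

    def node_for(self, op):
        if op == "read" and self.replicas:
            return random.choice(self.replicas)   # offload reads from the master
        return self.master                        # writes (or no replicas) -> master

client = ReplicaAwareClient(lambda: {"master": "m1", "replicas": ["r1", "r2"]})
assert client.node_for("write") == "m1"
assert client.node_for("read") in {"r1", "r2"}
```

Calling `refresh()` on topology-change events is what delivers the auto-failover: after a promotion, the next refresh repoints writes at the new master without developer intervention.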
  • Scaling to Billions of Requests-The Serverless Way at Capital One. They use both Lambda and Spark, depending on the use case. 
    • The ideal streaming solution should address the following requirements: scaling, throttling, fault tolerance, reusability, monitoring. 
    • Serverless streaming is used both for high volume use cases, like generating meaningful alerts based on customers’ transactions at thousands of events per second, and for low volume events, like card reissues, that run at tens of events per second.
    • Spark is used for high volume computing and batch processing loads where data and compute functions can be distributed and performed in parallel, like machine learning use cases, map/reduce jobs involving several hundred files, long running processes that deal with petabytes of data, and processing large transaction files to generate customers’ spend profiles using a machine learning model.
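Of the requirements Capital One lists (scaling, throttling, fault tolerance, reusability, monitoring), throttling is the most self-contained to illustrate: a token bucket allows bursts while enforcing a steady average rate. A minimal sketch, not Capital One's implementation:

```python
class TokenBucket:
    """Allow bursts up to `capacity` while enforcing a steady refill `rate`."""
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, 0.0

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=2, capacity=2)   # 2 events/sec, burst of 2
results = [bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0, 1.0, 1.0)]
assert results == [True, True, False, True, True, False]
```

In a streaming consumer, a rejected event would typically be retried later or left on the queue rather than dropped, which is where the fault-tolerance requirement takes over.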

 Soft Stuff: 

  • Replicache: a per-user cache that sits between your backend and client. Whatever you put in the server replica gets sent as deltas to the client on next sync. Any changes you make on the client replica get forwarded as requests to your service’s APIs. Replicache guarantees that after each sync, a client replica will exactly match the server. @aboodman: Replicache makes it easy to create blazingly fast mobile, web, and desktop applications by making them offline-first. It’s using something sort of like a CRDT. A true CRDT is not needed because Replicache is not intended to work in a masterless environment. 
  • trekhleb/state-of-the-art-shitcode: This is a list of state-of-the-art shitcode principles your project should follow to call it proper shitcode.
  • Netflix-Skunkworks/riskquant (article): A library to assist in quantifying risk.
  • cdk-patterns/serverless: This is an example CDK stack to deploy The Scalable Webhook stack described by Jeremy Daly
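Replicache (above) guarantees that after each sync the client replica exactly matches the server replica by shipping deltas. The idea can be illustrated with a naive dictionary diff, a toy only: Replicache's real protocol also handles client-side mutations, versions, and conflicts:

```python
def diff(server, client):
    """Compute the delta that turns the client replica into the server replica."""
    puts = {k: v for k, v in server.items() if client.get(k) != v}
    dels = [k for k in client if k not in server]
    return puts, dels

def apply_delta(client, delta):
    puts, dels = delta
    client.update(puts)
    for k in dels:
        client.pop(k)
    return client

server = {"todo:1": "buy milk", "todo:2": "ship it"}
client = {"todo:1": "buy milk", "todo:9": "stale"}
assert apply_delta(client, diff(server, client)) == server
```

Because only the delta crosses the network, a sync after a small server-side change is cheap even when the replica is large, which is what makes the offline-first model practical.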

Pub Stuff:

  • Automating Visual Privacy Protection Using a Smart LED: In this paper, we propose LiShield, a system that deters photographing of sensitive indoor physical space and automatically enforces location-bound visual privacy protection. LiShield protects the physical scenes against undesired recording without requiring user intervention and without disrupting the human visual perception. Our key idea is to illuminate the environment using smart LEDs, which are intensity-modulated following specialized waveforms. We design the waveform in such a way that its modulation pattern is imperceptible by human eyes but can interfere with the image sensors on mobile camera devices.
  • NSA on Mitigating Cloud Vulnerabilities: This document divides cloud vulnerabilities into four classes (misconfiguration, poor access control, shared tenancy vulnerabilities, and supply chain vulnerabilities) that encompass the vast majority of known vulnerabilities. Cloud customers have a critical role in mitigating misconfiguration and poor access control, but can also take actions to protect cloud resources from the exploitation of shared tenancy and supply chain vulnerabilities. Descriptions of each vulnerability class along with the most effective mitigations are provided to help organizations lock down their cloud resources. By taking a risk-based approach to cloud adoption, organizations can securely benefit from the cloud’s extensive capabilities.  
  • RSS++: load and state-aware receive side scaling (article, video, github) : While the current literature typically focuses on load-balancing among multiple servers, in this paper, we demonstrate the importance of load-balancing within a single machine (potentially with hundreds of CPU cores). In this context, we propose a new load-balancing technique (RSS++) that dynamically modifies the receive side scaling (RSS) indirection table to spread the load across the CPU cores in a more optimal way. RSS++ incurs up to 14x lower 95th percentile tail latency and orders of magnitude fewer packet drops compared to RSS under high CPU utilization. RSS++ allows higher CPU utilization and dynamic scaling of the number of allocated CPU cores to accommodate the input load while avoiding the typical 25% over-provisioning.
  • FLAIR: Accelerating Reads with Consistency-Aware Network Routing: We present FLAIR, a novel approach for accelerating read operations in leader-based consensus protocols. FLAIR leverages the capabilities of the new generation of programmable switches to serve reads from follower replicas without compromising consistency. The core of the new approach is a packet-processing pipeline that can track client requests and system replies, identify consistent replicas, and at line speed, forward read requests to replicas that can serve the read without sacrificing linearizability. An additional benefit of FLAIR is that it facilitates devising novel consistency-aware load balancing techniques. Our evaluation indicates that, compared to state-of-the-art alternatives, the proposed approach can bring significant performance gains: up to 42% higher throughput and 35-97% lower latency for most workloads. 
  • Trade-Offs Under Pressure: Heuristics and Observations Of Teams Resolving Internet Service Outages: Three diagnostic heuristics were identified as being in use: a) initially look for correlation between the behaviour and any recent changes made in the software, b) upon finding no correlation with a software change, widen the search to any potential contributors imagined, and c) when choosing a diagnostic direction, reduce it by focusing on the one that most easily comes to mind, either because symptoms match those of a difficult-to-diagnose event in the past, or those of any recent events. A fourth heuristic is coordinative in nature: when making changes to software in an effort to mitigate the untoward effects or to resolve the issue completely, rely on peer review of the changes more than automated testing (if at all.)
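FLAIR's routing rule (above) boils down to: serve a read from a follower only if that follower has provably applied everything the client could already have observed, otherwise fall back to the leader. A sketch of that decision as plain code, a toy model only; the paper implements this in a programmable switch at line rate, tracking requests and replies per object:

```python
def route_read(leader, followers, client_last_seen_index):
    """Pick a follower that is consistent for this client, else use the
    leader, which is always safe for linearizable reads."""
    for name, applied_index in followers.items():
        if applied_index >= client_last_seen_index:
            return name   # this follower has caught up past the client's view
    return leader

# Hypothetical replica states: highest log index each follower has applied.
followers = {"f1": 5, "f2": 9}
assert route_read("leader", followers, client_last_seen_index=7) == "f2"
assert route_read("leader", followers, client_last_seen_index=12) == "leader"
```

The same bookkeeping is what enables the consistency-aware load balancing the paper mentions: among the consistent followers, the switch is free to pick the least loaded one.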

from High Scalability: http://feedproxy.google.com/~r/HighScalability/~3/U3KFjOHfdWU/stuff-the-internet-says-on-scalability-for-february-7th-2020.html