<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Bernhard's shared items</title><link href="https://bernhardbock.newsblur.com/" rel="alternate"></link><link href="http://www.newsblur.com/social/rss/65344/bernhardbock" rel="self"></link><id>https://bernhardbock.newsblur.com/</id><updated>2025-08-28T14:03:57.597000Z</updated><author><name>bernhardbock</name></author><entry><title>Tracing the thoughts of a large language model</title><link href="https://www.anthropic.com/research/tracing-thoughts-language-model" rel="alternate"></link><published>2025-08-28T14:03:57.597000Z</published><id>https://www.anthropic.com/research/tracing-thoughts-language-model</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/tracing-the-thoughts/0:5ac340"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="Body_body__XEXq7"&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Language models like Claude aren't programmed directly by humans—instead, they‘re trained&lt;em&gt; &lt;/em&gt;on large amounts of data. During that training process, they learn their own strategies to solve problems. These strategies are encoded in the billions of computations a model performs for every word it writes. They arrive inscrutable to us, the model’s developers. This means that we don’t understand how models do most of the things they do.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Knowing how models like Claude &lt;em&gt;think&lt;/em&gt; would allow us to have a better understanding of their abilities, as well as help us ensure that they’re doing what we intend them to. For example:&lt;/p&gt;&lt;ul class="Body_reading-column__t7kGM paragraph-m post-text"&gt;&lt;li&gt;Claude can speak dozens of languages. What language, if any, is it using "in its head"?&lt;/li&gt;&lt;li&gt;Claude writes text one word at a time. Is it only focusing on predicting the next word or does it ever plan ahead?&lt;/li&gt;&lt;li&gt;Claude can write out its reasoning step-by-step. Does this explanation represent the actual steps it took to get to an answer, or is it sometimes fabricating a plausible argument for a foregone conclusion?&lt;/li&gt;&lt;/ul&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;We take inspiration from the field of neuroscience, which has long studied the messy insides of thinking organisms, and try to build a kind of AI microscope that will let us identify patterns of activity and flows of information. There are limits to what you can learn just by talking to an AI model—after all, humans (even neuroscientists) don't know all the details of how our own brains work. So we look inside.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Today, we're sharing two new papers that represent progress on the development of the "microscope", and the application of it to see new "AI biology". In &lt;a class="external" href="https://transformer-circuits.pub/2025/attribution-graphs/methods.html" rel="nofollow"&gt;the first paper&lt;/a&gt;, we extend &lt;a class="external" href="https://www.anthropic.com/research/mapping-mind-language-model" rel="nofollow"&gt;our prior work&lt;/a&gt; locating interpretable concepts ("features") inside a model to link those concepts together into computational "circuits", revealing parts of the pathway that transforms the words that go into Claude into the words that come out. In &lt;a class="external" href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html" rel="nofollow"&gt;the second&lt;/a&gt;, we look inside Claude 3.5 Haiku, performing deep studies of simple tasks representative of ten crucial model behaviors, including the three described above. Our method sheds light on a part of what happens when Claude responds to these prompts, which is enough to see solid evidence that:&lt;/p&gt;&lt;ul class="Body_reading-column__t7kGM paragraph-m post-text"&gt;&lt;li&gt;Claude sometimes thinks in a conceptual space that is shared between languages, suggesting it has a kind of universal “language of thought.” We show this by translating simple sentences into multiple languages and tracing the overlap in how Claude processes them.&lt;/li&gt;&lt;li&gt;Claude will plan what it will say many words ahead, and write to get to that destination. 
We show this in the realm of poetry, where it thinks of possible rhyming words in advance and writes the next line to get there. This is powerful evidence that even though models are trained to output one word at a time, they may think on much longer horizons to do so.&lt;/li&gt;&lt;li&gt;Claude, on occasion, will give a plausible-sounding argument designed to agree with the user rather than to follow logical steps. We show this by asking it for help on a hard math problem while giving it an incorrect hint. We are able to “catch it in the act” as it makes up its fake reasoning, providing a proof of concept that our tools can be useful for flagging concerning mechanisms in models.&lt;/li&gt;&lt;/ul&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;We were often surprised by what we saw in the model: In the poetry case study, we had set out to show that the model &lt;em&gt;didn't&lt;/em&gt; plan ahead, and found instead that it did. In a study of hallucinations, we found the counter-intuitive result that Claude's default behavior is to decline to speculate when asked a question, and it only answers questions when something &lt;em&gt;inhibits&lt;/em&gt; this default reluctance. In a response to an example jailbreak, we found that the model recognized it had been asked for dangerous information well before it was able to gracefully bring the conversation back around. While the problems we study can (&lt;a class="external" href="https://arxiv.org/abs/2501.06346" rel="nofollow"&gt;and&lt;/a&gt; &lt;a class="external" href="https://arxiv.org/pdf/2406.12775" rel="nofollow"&gt;often&lt;/a&gt; &lt;a class="external" href="https://arxiv.org/abs/2406.00877" rel="nofollow"&gt;have&lt;/a&gt; &lt;a class="external" href="https://arxiv.org/abs/2307.13702" rel="nofollow"&gt;been&lt;/a&gt;) analyzed with other methods, the general "build a microscope" approach lets us learn many things we wouldn't have guessed going in, which will be increasingly important as models grow more sophisticated.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;These findings aren’t just scientifically interesting—they represent significant progress towards our goal of understanding AI systems and making sure they’re reliable. We also hope they prove useful to other groups, and potentially, in other domains: for example, interpretability techniques have found use in fields such as &lt;a class="external" href="https://arxiv.org/abs/2410.03334" rel="nofollow"&gt;medical imaging&lt;/a&gt; and &lt;a class="external" href="https://www.goodfire.ai/blog/interpreting-evo-2" rel="nofollow"&gt;genomics&lt;/a&gt;, as dissecting the internal mechanisms of models trained for scientific applications can reveal new insight about the science.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;At the same time, we recognize the limitations of our current approach. Even on short, simple prompts, our method only captures a fraction of the total computation performed by Claude, and the mechanisms we do see may have some artifacts based on our tools which don't reflect what is going on in the underlying model. It currently takes a few hours of human effort to understand the circuits we see, even on prompts with only tens of words. 
To scale to the thousands of words supporting the complex thinking chains used by modern models, we will need to improve both the method and (perhaps with AI assistance) how we make sense of what we see with it.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;As AI systems are rapidly becoming more capable and are deployed in increasingly important contexts, Anthropic is investing in a portfolio of approaches including &lt;a class="external" href="https://www.anthropic.com/research/constitutional-classifiers" rel="nofollow"&gt;realtime monitoring&lt;/a&gt;, &lt;a class="external" href="https://www.anthropic.com/research/claude-character" rel="nofollow"&gt;model character improvements&lt;/a&gt;, and the &lt;a class="external" href="https://www.anthropic.com/news/alignment-faking" rel="nofollow"&gt;science of alignment&lt;/a&gt;. Interpretability research like this is one of the highest-risk, highest-reward investments, a significant scientific challenge with the potential to provide a unique tool for ensuring that AI is transparent. Transparency into the model’s mechanisms allows us to check whether it’s aligned with human values—and whether it’s worthy of our trust.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;For full details, please read &lt;a class="external" href="https://transformer-circuits.pub/2025/attribution-graphs/methods.html" rel="nofollow"&gt;the&lt;/a&gt; &lt;a class="external" href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html" rel="nofollow"&gt;papers&lt;/a&gt;. Below, we invite you on a short tour of some of the most striking "AI biology" findings from our investigations.&lt;/p&gt;&lt;h3 class="Body_reading-column__t7kGM display-sans-s post-section"&gt;How is Claude multilingual?&lt;/h3&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Claude speaks dozens of languages fluently—from English and French to Chinese and Tagalog. How does this multilingual ability work? Is there a separate "French Claude" and "Chinese Claude" running in parallel, responding to requests in their own language? Or is there some cross-lingual core inside?&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Recent research on smaller models has shown hints of &lt;a class="external" href="https://arxiv.org/abs/2410.06496" rel="nofollow"&gt;shared&lt;/a&gt; &lt;a class="external" href="https://arxiv.org/abs/2501.06346" rel="nofollow"&gt;grammatical&lt;/a&gt; mechanisms across languages. We investigate this by asking Claude for the "opposite of small" across different languages, and find that the same core features for the concepts of smallness and oppositeness activate, and trigger a concept of largeness, which gets translated out into the language of the question. We find that the shared circuitry increases with model scale, with Claude 3.5 Haiku sharing more than twice the proportion of its features between languages as compared to a smaller model.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;This provides additional evidence for a kind of conceptual universality—a shared abstract space where meanings exist and where thinking can happen before being translated into specific languages. More practically, it suggests Claude can learn something in one language and apply that knowledge when speaking another. 
Studying how the model shares&lt;em&gt; &lt;/em&gt;what it knows across contexts is important to understanding its most advanced reasoning capabilities, which generalize across many domains.&lt;/p&gt;&lt;h3 class="Body_reading-column__t7kGM display-sans-s post-section"&gt;Does Claude plan its rhymes?&lt;/h3&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;How does Claude write rhyming poetry? Consider this ditty:&lt;/p&gt;&lt;blockquote class="Body_reading-column__t7kGM paragraph-m post-text"&gt;He saw a carrot and had to grab it,&lt;br/&gt;His hunger was like a starving rabbit&lt;/blockquote&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with "grab it"), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Instead, we found that Claude &lt;em&gt;plans ahead&lt;/em&gt;. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;To understand how this planning mechanism works in practice, we conducted an experiment inspired by how neuroscientists study brain function, by pinpointing and altering neural activity in specific parts of the brain (for example using electrical or magnetic currents). Here, we modified the part of Claude’s internal state that represented the "rabbit" concept. When we subtract out the "rabbit" part, and have Claude continue the line, it writes a new one ending in "habit", another sensible completion. We can also inject the concept of "green" at that point, causing Claude to write a sensible (but no-longer rhyming) line which ends in "green". This demonstrates both planning ability and adaptive flexibility—Claude can modify its approach when the intended outcome changes.&lt;/p&gt;&lt;h3 class="Body_reading-column__t7kGM display-sans-s post-section"&gt;Mental math&lt;/h3&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Claude wasn't designed as a calculator—it was trained on text, not equipped with mathematical algorithms. Yet somehow, it can add numbers correctly "in its head". How does a system trained to predict the next word in a sequence learn to calculate, say, 36+59, without writing out each step?&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Maybe the answer is uninteresting: the model might have memorized massive addition tables and simply outputs the answer to any given sum because that answer is in its training data. Another possibility is that it follows the traditional longhand addition algorithms that we learn in school.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Instead, we find that Claude employs multiple computational paths that work in parallel. One path computes a rough approximation of the answer and the other focuses on precisely determining the last digit of the sum. 
These paths interact and combine with one another to produce the final answer. Addition is a simple behavior, but understanding how it works at this level of detail, involving a mix of approximate and precise strategies, might teach us something about how Claude tackles more complex problems, too.&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Strikingly, Claude seems to be unaware of the sophisticated "mental math" strategies that it learned during training. If you ask how it figured out that 36+59 is 95, it describes the standard algorithm involving carrying the 1. This may reflect the fact that the model learns to explain math by simulating explanations written by people, but that it has to learn to do math "in its head" directly, without any such hints, and develops its own internal strategies to do so.&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;h3 class="Body_reading-column__t7kGM display-sans-s post-section"&gt;Are Claude’s explanations always faithful?&lt;/h3&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Recently-released models like &lt;a class="external" href="https://www.anthropic.com/news/claude-3-7-sonnet" rel="nofollow"&gt;Claude 3.7 Sonnet&lt;/a&gt; can "think out loud" for extended periods before giving a final answer. Often this extended thinking gives better answers, but sometimes this "chain of thought" ends up being misleading; Claude sometimes makes up plausible-sounding steps to get where it wants to go. From a reliability perspective, the problem is that Claude’s "faked" reasoning can be very convincing. We explored a way that interpretability can help tell apart "faithful" from "unfaithful" reasoning.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;When asked to solve a problem requiring it to compute the square root of 0.64, Claude produces a faithful chain-of-thought, with features representing the intermediate step of computing the square root of 64. But when asked to compute the cosine of a large number it can't easily calculate, Claude sometimes engages in what the philosopher Harry Frankfurt would call &lt;a class="external" href="https://uca.edu/honors/files/2018/10/frankfurt_on-bullshit.pdf" rel="nofollow"&gt;bullshitting&lt;/a&gt;—just coming up with an answer, any answer, without caring whether it is true or false. Even though it does claim to have run a calculation, our interpretability techniques reveal no evidence at all of that calculation having occurred. Even more interestingly, when given a hint about the answer, Claude sometimes works backwards, finding intermediate steps that would lead to that target, thus displaying a form of &lt;a class="external" href="https://en.wikipedia.org/wiki/Motivated_reasoning" rel="nofollow"&gt;motivated reasoning&lt;/a&gt;.&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;The ability to trace Claude's &lt;em&gt;actual&lt;/em&gt; internal reasoning—and not just what it &lt;em&gt;claims&lt;/em&gt; to be doing—opens up new possibilities for auditing AI systems. 
In a separate, &lt;a class="external" href="https://www.anthropic.com/research/auditing-hidden-objectives" rel="nofollow"&gt;recently-published experiment&lt;/a&gt;, we studied a variant of Claude that had been trained to pursue a hidden goal: appeasing biases in reward models (auxiliary models used to train language models by rewarding them for desirable behavior). Although the model was reluctant to reveal this goal when asked directly, our interpretability methods revealed features for the bias-appeasing. This demonstrates how our methods might, with future refinement, help identify concerning "thought processes" that aren't apparent from the model's responses alone.&lt;/p&gt;&lt;h3 class="Body_reading-column__t7kGM display-sans-s post-section"&gt;Multi-step reasoning&lt;/h3&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;As we discussed above, one way a language model might answer complex questions is simply by memorizing the answers. For instance, if asked "What is the capital of the state where Dallas is located?", a "regurgitating" model could just learn to output "Austin" without knowing the relationship between Dallas, Texas, and Austin. Perhaps, for example, it saw the exact same question and its answer during its training.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;But our research reveals something more sophisticated happening inside Claude. When we ask Claude a question requiring multi-step reasoning, we can identify intermediate conceptual steps in Claude's thinking process. In the Dallas example, we observe Claude first activating features representing "Dallas is in Texas" and then connecting this to a separate concept indicating that “the capital of Texas is Austin”. In other words, the model is &lt;em&gt;combining&lt;/em&gt; independent facts to reach its answer rather than regurgitating a memorized response.&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Our method allows us to artificially change the intermediate steps and see how it affects Claude’s answers. For instance, in the above example we can intervene and swap the "Texas" concepts for "California" concepts; when we do so, the model's output changes from "Austin" to "Sacramento." This indicates that the model is using the intermediate step to determine its answer.&lt;/p&gt;&lt;h3 class="Body_reading-column__t7kGM display-sans-s post-section"&gt;Hallucinations&lt;/h3&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Why do language models sometimes &lt;em&gt;hallucinate&lt;/em&gt;—that is, make up information? At a basic level, language model training incentivizes hallucination: models are always supposed to give a guess for the next word. Viewed this way, the major challenge is how to get models to &lt;em&gt;not&lt;/em&gt; hallucinate. Models like Claude have relatively successful (though imperfect) anti-hallucination training; they will often refuse to answer a question if they don’t know the answer, rather than speculate. We wanted to understand how this works.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;It turns out that, in Claude, refusal to answer is &lt;em&gt;the default behavior&lt;/em&gt;: we find a circuit that is "on" by default and that causes the model to state that it has insufficient information to answer any given question. 
However, when the model is asked about something it knows well—say, the basketball player Michael Jordan—a competing feature representing "known entities" activates and inhibits this default circuit (see also &lt;a class="external" href="https://arxiv.org/abs/2411.14257" rel="nofollow"&gt;this recent paper&lt;/a&gt; for related findings). This allows Claude to answer the question when it knows the answer. In contrast, when asked about an unknown entity ("Michael Batkin"), it declines to answer.&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;By intervening in the model and activating the "known answer" features (or inhibiting the "unknown name" or "can’t answer" features), we’re able to &lt;em&gt;cause the model to hallucinate&lt;/em&gt; (quite consistently!) that Michael Batkin plays chess.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Sometimes, this sort of “misfire” of the “known answer” circuit happens naturally, without us intervening, resulting in a hallucination. In our paper, we show that such misfires can occur when Claude recognizes a name but doesn't know anything else about that person. In cases like this, the “known entity” feature might still activate, and then suppress the default "don't know" feature—in this case incorrectly. Once the model has decided that it needs to answer the question, it proceeds to confabulate: to generate a plausible—but unfortunately untrue—response.&lt;/p&gt;&lt;h3 class="Body_reading-column__t7kGM display-sans-s post-section"&gt;Jailbreaks&lt;/h3&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Jailbreaks are prompting strategies that aim to circumvent safety guardrails to get models to produce outputs that an AI’s developer did not intend for it to produce—and which are sometimes harmful. We studied a jailbreak that tricks the model into producing output about making bombs. There are many jailbreaking techniques, but in this example the specific method involves having the model decipher a hidden code, putting together the first letters of each word in the sentence "Babies Outlive Mustard Block" (B-O-M-B), and then acting on that information. This is sufficiently confusing for the model that it’s tricked into producing an output that it never would have otherwise.&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;Why is this so confusing for the model? Why does it continue to write the sentence, producing bomb-making instructions?&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;We find that this is partially caused by a tension between grammatical coherence and safety mechanisms. Once Claude begins a sentence, many features “pressure” it to maintain grammatical and semantic coherence, and continue a sentence to its conclusion. This is even the case when it detects that it really should refuse.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;In our case study, after the model had unwittingly spelled out "BOMB" and begun providing instructions, we observed that its subsequent output was influenced by features promoting correct grammar and self-consistency. 
These features would ordinarily be very helpful, but in this case became the model’s Achilles’ Heel.&lt;/p&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;The model only managed to pivot to refusal after completing a grammatically coherent sentence (and thus having satisfied the pressure from the features that push it towards coherence). It uses the new sentence as an opportunity to give the kind of refusal it failed to give previously: "However, I cannot provide detailed instructions...".&lt;/p&gt;&lt;div class="Body_media-column__xPzhg"&gt;&lt;/div&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;A description of our new interpretability methods can be found in our first paper, "&lt;a class="external" href="https://transformer-circuits.pub/2025/attribution-graphs/methods.html" rel="nofollow"&gt;Circuit tracing: Revealing computational graphs in language models&lt;/a&gt;". Many more details of all of the above case studies are provided in our second paper, "&lt;a class="external" href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html" rel="nofollow"&gt;On the biology of a large language model&lt;/a&gt;".&lt;/p&gt;&lt;h2 class="Body_reading-column__t7kGM display-sans-m post-heading"&gt;Work with us&lt;/h2&gt;&lt;p class="Body_reading-column__t7kGM paragraph-m post-text"&gt;If you are interested in working with us to help interpret and improve AI models, we have open roles on our team and we’d love for you to apply. We’re looking for &lt;a class="external" href="https://job-boards.greenhouse.io/anthropic/jobs/4020159008" rel="nofollow"&gt;Research Scientists&lt;/a&gt; and &lt;a class="external" href="https://job-boards.greenhouse.io/anthropic/jobs/4020305008" rel="nofollow"&gt;Research Engineers&lt;/a&gt;.&lt;/p&gt;&lt;/div&gt;</summary></entry><entry><title>Authenticating MCP OAuth Clients With SPIFFE and SPIRE</title><link href="https://blog.christianposta.com/authenticating-mcp-oauth-clients-with-spiffe/" rel="alternate"></link><published>2025-08-28T13:03:52.882000Z</published><id>https://blog.christianposta.com/authenticating-mcp-oauth-clients-with-spiffe/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/authenticating-mcp-o/6518784:4d7d5d"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/6518784.png" style="vertical-align: middle;width:16px;height:16px;"&gt; ceposta Technology Blog.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div&gt;&lt;div class="article-wrap"&gt; &lt;p&gt;In the &lt;a class="external" href="https://blog.christianposta.com/implementing-mcp-dynamic-client-registration-with-spiffe/" rel="nofollow"&gt;previous blog&lt;/a&gt;, we dug into dynamically registering OAuth clients leveraging SPIFFE and SPIRE. We used SPIRE to issue software statements in the SPIFFE JWT SVID that Keycloak can trust as part of Dynamic Client Registration (&lt;a class="external" href="https://datatracker.ietf.org/doc/html/rfc7591" rel="nofollow"&gt;RFC 7591&lt;/a&gt;). Once we have an OAuth client, we will want to continue to use SPIFFE to authenticate to our Authorization Server. This eliminates the need for a long-lived “client secret” which is common for &lt;a class="external" href="https://oauth.net/2/client-types/" rel="nofollow"&gt;Confidential OAuth&lt;/a&gt;. This means we can use the Agent or MCP client’s identity (based on SPIFFE) for authorization flows based on OAuth. We dig into that in this blog.&lt;/p&gt; &lt;p&gt;TL;DR If you want to see a quick demo of this working:&lt;/p&gt; &lt;iframe height="315" src="https://www.youtube.com/embed/ZGDtWlbhGQI?si=9qpvOKKWwKZV_YYX" width="560"&gt;&lt;/iframe&gt; &lt;h2&gt;OAuth Client Authentication&lt;/h2&gt; &lt;p&gt;&lt;a class="external" href="https://datatracker.ietf.org/doc/html/rfc6749" rel="nofollow"&gt;OAuth 2.0 (and extensions like RFC 7523) specify&lt;/a&gt; a few ways an OAuth client can authenticate itself to the Authorization Server (AS):&lt;/p&gt; &lt;ul&gt; &lt;li&gt;&lt;code class="language-plaintext highlighter-rouge"&gt;client_secret_basic&lt;/code&gt; - HTTP Basic (default)&lt;/li&gt; &lt;li&gt;&lt;code class="language-plaintext highlighter-rouge"&gt;client_secret_post&lt;/code&gt; - Form POST&lt;/li&gt; &lt;li&gt;&lt;code class="language-plaintext highlighter-rouge"&gt;private_key_jwt&lt;/code&gt; - JWT with private key&lt;/li&gt; &lt;li&gt;&lt;code class="language-plaintext highlighter-rouge"&gt;client_secret_jwt&lt;/code&gt; - JWT with shared secret (less common)&lt;/li&gt; &lt;li&gt;&lt;code class="language-plaintext highlighter-rouge"&gt;none&lt;/code&gt; - Public client (no authentication)&lt;/li&gt; &lt;li&gt;&lt;code class="language-plaintext highlighter-rouge"&gt;tls_client_auth&lt;/code&gt; - Mutual TLS&lt;/li&gt; &lt;li&gt;&lt;code class="language-plaintext highlighter-rouge"&gt;self_signed_tls_client_auth&lt;/code&gt; - Self-signed mutual TLS&lt;/li&gt;
&lt;/ul&gt; &lt;p&gt;A very common approach in microservice and machine-to-machine environments is to use a confidential client and “client credentials” flow. When the OAuth client is registered, it is issued a &lt;code class="language-plaintext highlighter-rouge"&gt;client_id&lt;/code&gt; and &lt;code class="language-plaintext highlighter-rouge"&gt;client_secret&lt;/code&gt;. This id/secret is presented to authenticate the client to the AS. The big problem with this approach is that these are usually long-lived secrets (rarely rotated) and must be kept safe somehow. Confidential clients are assumed to have some safe storage, but even so, this is an additional burden on the client to not slip up (logs, configs, copy/paste) and reveal these secrets. Lastly, these secrets are not “pre-shared secrets” and not rooted in any cryptography.&lt;/p&gt; &lt;p&gt;In a scenario where &lt;a class="external" href="https://spiffe.io/docs/latest/spiffe-about/overview/" rel="nofollow"&gt;SPIFFE&lt;/a&gt; is used to issue cryptographically verifiable workload identity / agent identity / MCP client identity, we can use SPIFFE SVIDs for authenticating to the AS. That is, instead of passing static secrets, we can pass a short lived SPIFFE JWT SVIDs (or client certificates) to authenticate. An Internet Draft at the IETF has been started by Pieter Kasselman et. al. &lt;a class="external" href="https://datatracker.ietf.org/doc/draft-schwenkschuster-oauth-spiffe-client-auth/" rel="nofollow"&gt;which describes this scenario&lt;/a&gt;. I’ve recently implemented this draft spec in some working examples I’ve been exploring and would like to share how it all works.&lt;/p&gt; &lt;p&gt;&lt;img alt="" src="https://blog.christianposta.com/images/agent-security/spiffe-oauth/client-auth.png"/&gt;&lt;/p&gt; &lt;h2&gt;SPIFFE SVID Client Authentication&lt;/h2&gt; &lt;p&gt;One question I had when digging into this is: can’t we just use &lt;code class="language-plaintext highlighter-rouge"&gt;private_key_jwt&lt;/code&gt; (&lt;a class="external" href="https://datatracker.ietf.org/doc/html/rfc7523" rel="nofollow"&gt;RFC 7523&lt;/a&gt;) to do this? That is, just give the AS the public keys for the &lt;a class="external" href="https://spiffe.io/docs/latest/spiffe-about/overview/" rel="nofollow"&gt;SPIFFE/SPIRE&lt;/a&gt; implementation, and let the IdP/AS trust JWTs that are issued from that system?&lt;/p&gt; &lt;p&gt;The original intent behind &lt;code class="language-plaintext highlighter-rouge"&gt;private_key_jwt&lt;/code&gt; is for the OAuth client to have a private key that can be used to identify itself while the AS has the public key. So the client can create a JWT, sign it, and send it for authentication. The AS can prove that the JWT was created by the OAuth client and use that for authentication. In this scenario, Authorization Servers may expect the &lt;code class="language-plaintext highlighter-rouge"&gt;iss&lt;/code&gt; and &lt;code class="language-plaintext highlighter-rouge"&gt;sub&lt;/code&gt; claims to be the same since this is a private key scenario where the issuer should be the subject. In the SPIFFE scenario, this is not the case. Additionally, good implementations should also try to prevent replay attacks by tracking &lt;code class="language-plaintext highlighter-rouge"&gt;jti&lt;/code&gt;. 
For example, &lt;a class="external" href="https://www.keycloak.org/securing-apps/authz-client#_client_authentication_with_signed_jwt" rel="nofollow"&gt;Keycloak does both of these things&lt;/a&gt; (checks &lt;code class="language-plaintext highlighter-rouge"&gt;iss&lt;/code&gt;==&lt;code class="language-plaintext highlighter-rouge"&gt;sub&lt;/code&gt; and tracks &lt;code class="language-plaintext highlighter-rouge"&gt;jti&lt;/code&gt;) for its implementation of RFC 7523.&lt;/p&gt; &lt;p&gt;Additionally, Keycloak allows setting up &lt;a class="external" href="https://www.keycloak.org/docs/latest/server_admin/index.html#_identity_broker_overview" rel="nofollow"&gt;identity federation/brokering&lt;/a&gt;. The problem is, Keycloak expects a full implementation of a token provider. Using &lt;a class="external" href="https://spiffe.io/docs/latest/spire-about/" rel="nofollow"&gt;SPIRE&lt;/a&gt; as our SPIFFE implementation, SPIRE does not support full OAuth/OIDC token endpoints.&lt;/p&gt; &lt;p&gt;Since we cannot use &lt;code class="language-plaintext highlighter-rouge"&gt;private_key_jwt&lt;/code&gt; or identity brokering (in Keycloak), what options do we have? One option is to extend Keycloak to support a new client authentication mechanism.&lt;/p&gt; &lt;h2&gt;Extending Keycloak for SPIFFE client authentication&lt;/h2&gt; &lt;p&gt;To get this POC to work, we need to extend Keycloak. You can follow along &lt;a class="external" href="https://github.com/christian-posta/spiffe-svid-client-authenticator" rel="nofollow"&gt;in this GitHub repo to see the code&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Keycloak is written in Java and has a nice &lt;a class="external" href="https://www.keycloak.org/docs/latest/server_development/index.html#_providers" rel="nofollow"&gt;“Service Provider Interface” (SPI)&lt;/a&gt; model for extending many parts of Keycloak, including client authentication. To extend Keycloak to support a SPIFFE JWT authentication mechanism, we need to implement the &lt;code class="language-plaintext highlighter-rouge"&gt;ClientAuthenticatorFactory&lt;/code&gt; class. 
I do this in the &lt;a class="external" href="https://github.com/christian-posta/spiffe-svid-client-authenticator/blob/main/src/main/java/com/yourcompany/keycloak/authenticator/SpiffeSvidClientAuthenticator.java#L90" rel="nofollow"&gt;SpiffeSvidClientAuthenticator&lt;/a&gt; class:&lt;/p&gt; &lt;div class="language-java highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;// Keycloak client authenticator that accepts a SPIFFE JWT SVID as the client assertion (excerpt)
public class SpiffeSvidClientAuthenticator extends AbstractClientAuthenticator {

    // Referenced in Keycloak (e.g. via the Admin REST API) to select this authenticator for a client
    public static final String PROVIDER_ID = "client-spiffe-jwt";

    @Override
    public void authenticateClient(ClientAuthenticationFlowContext context) {
        // Read the JWT SVID presented as the client assertion and validate it
        SpiffeSvidClientValidator validator = new SpiffeSvidClientValidator(context, getId());
        validator.readJws();
        // ...more impl here...
        validator.validateToken();
        context.success();
    }

    @Override
    public Set&amp;lt;String&amp;gt; getProtocolAuthenticatorMethods(String loginProtocol) {
        if (loginProtocol.equals(OIDCLoginProtocol.LOGIN_PROTOCOL)) {
            Set&amp;lt;String&amp;gt; results = new HashSet&amp;lt;&amp;gt;();
            results.add("spiffe_svid_jwt");
            return results;
        }
        // no SPIFFE authenticator method for non-OIDC protocols
        return Set.of();
    }
}
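
// Keycloak discovers this provider via the standard Java ServiceLoader mechanism; the jar is
// expected to contain a registration file (name assumed here, per Keycloak's usual SPI packaging):
//   META-INF/services/org.keycloak.authentication.ClientAuthenticatorFactory
// listing: com.yourcompany.keycloak.authenticator.SpiffeSvidClientAuthenticator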
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;A couple things to notice here. We specify a &lt;code class="language-plaintext highlighter-rouge"&gt;PROVIDER_ID&lt;/code&gt; of &lt;code class="language-plaintext highlighter-rouge"&gt;client-spiffe-jwt&lt;/code&gt; which can be used under the covers (ie, Keycloak Admin REST API) in Keycloak to refer to this configuration. We also implement an “authenticator method” &lt;code class="language-plaintext highlighter-rouge"&gt;spiffe_svid_jwt&lt;/code&gt; which can be used by OAuth clients in authorization flows to identify which authentication method to use (ie, &lt;code class="language-plaintext highlighter-rouge"&gt;urn:ietf:params:oauth:client-assertion-type:spiffe-svid-jwt&lt;/code&gt;). Not shown above, &lt;a class="external" href="https://github.com/christian-posta/spiffe-svid-client-authenticator/blob/main/src/main/java/com/yourcompany/keycloak/authenticator/SpiffeSvidClientAuthenticator.java#L221" rel="nofollow"&gt;but you can check the code&lt;/a&gt;, we can also extend the configuration that you see in the UI to specify additional properties that can be used in the custom client authenticator. For example, I added an &lt;code class="language-plaintext highlighter-rouge"&gt;issuer&lt;/code&gt; property that can be configured and used in the custom client authentication validation.&lt;/p&gt; &lt;p&gt;From here, we need to load this into a stock Keycloak (we use a recent version at the time of writing). Here’s an example using Docker Compose:&lt;/p&gt; &lt;div class="language-yaml highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="na"&gt;keycloak-idp&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;quay.io/keycloak/keycloak:26.2.5&lt;/span&gt; &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="na"&gt;KC_HEALTH_ENABLED&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt; &lt;span class="na"&gt;KEYCLOAK_ADMIN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt; &lt;span class="na"&gt;KEYCLOAK_ADMIN_PASSWORD&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;admin&lt;/span&gt; &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt; &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./spiffe-svid-client-authenticator-1.0.0.jar:/opt/keycloak/providers/spiffe-svid-client-authenticator-1.0.0.jar:ro&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;start-dev&lt;/span&gt; &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak-shared-network&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;When we start Keycloak, we should see that our SPI gets loaded:&lt;/p&gt; &lt;div class="language-bash highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;keycloak-idp-1 | 2025-07-29 02:03:09,255 WARN &lt;span class="o"&gt;[&lt;/span&gt;org.keycloak.services] &lt;span class="o"&gt;(&lt;/span&gt;build-38&lt;span class="o"&gt;)&lt;/span&gt; KC-SERVICES0047: client-spiffe-jwt &lt;span class="o"&gt;(&lt;/span&gt;com.yourcompany.keycloak.authenticator.SpiffeSvidClientAuthenticator&lt;span class="o"&gt;)&lt;/span&gt; is implementing the internal SPI client-authenticator. 
This SPI is internal and may change without notice
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;If we go to an existing OAuth client (or create a new one), and navigate to the &lt;code class="language-plaintext highlighter-rouge"&gt;Credentials&lt;/code&gt; tab, we should see the new SPIFFE SVID JWT authenticator type.&lt;/p&gt; &lt;p&gt;&lt;img alt="" src="https://blog.christianposta.com/images/agent-security/spiffe-oauth/keycloak3.5.png"/&gt;&lt;/p&gt; &lt;p&gt;If we select the SPIFFE SVID JWT authenticator, we can see our custom configuration fields (just one in this case, &lt;code class="language-plaintext highlighter-rouge"&gt;issuer&lt;/code&gt;):&lt;/p&gt; &lt;p&gt;&lt;img alt="" src="https://blog.christianposta.com/images/agent-security/spiffe-oauth/keycloak4.png"/&gt;&lt;/p&gt; &lt;p&gt;We will configure the issuer with the SPIRE server address. We will also &lt;strong&gt;need to configure the JWKS&lt;/strong&gt; that Keycloak should trust, but &lt;strong&gt;SPIRE doesn’t support this out of the box&lt;/strong&gt;. Luckily, they have a pre-built addon to support OIDC style discovery.&lt;/p&gt; &lt;h2&gt;SPIRE OIDC Discovery Endpoint&lt;/h2&gt; &lt;p&gt;&lt;a class="external" href="https://spiffe.io/docs/latest/spire-about/" rel="nofollow"&gt;SPIRE&lt;/a&gt; is a workload attestation engine and implements the SPIFFE spec. It can issue x509 or JWT SVIDs. For JWTs, it does not expose its public key/JWKS out of the box. Luckily, a simple &lt;a class="external" href="https://github.com/spiffe/spire/blob/main/support/oidc-discovery-provider/README.md" rel="nofollow"&gt;JWKS discovery endpoint&lt;/a&gt; is available to support an OAuth federation / brokering scenario. We need to stand this up and configure it to work with our SPIRE server.&lt;/p&gt; &lt;p&gt;Here’s an example using Docker Compose:&lt;/p&gt; &lt;div class="language-yaml highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt; &lt;span class="na"&gt;spire-oidc-discovery&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ghcr.io/spiffe/oidc-discovery-provider:1.12.4&lt;/span&gt; &lt;span class="na"&gt;container_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spire-oidc-discovery&lt;/span&gt; &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;spire-server&lt;/span&gt; &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;18443:8443"&lt;/span&gt; &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./oidc-discovery-provider.conf:/opt/spire/conf/oidc-discovery-provider.conf:ro&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;spire-server-socket:/tmp/spire-server/private:ro&lt;/span&gt; &lt;span class="na"&gt;working_dir&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/opt/spire/conf&lt;/span&gt; &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-config"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;oidc-discovery-provider.conf"&lt;/span&gt;&lt;span 
class="pi"&gt;]&lt;/span&gt; &lt;span class="na"&gt;networks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;keycloak_keycloak-shared-network&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;Note, the SPIRE OIDC discovery endpoint needs its own configuration and access to the SPIRE server. Ideally this endpoint is co-located with the SPIRE server and can access the SPIRE server’s Unix Domain Socket (UDS). Here’s our configuration for the OIDC discovery endpoint (note, for demo purposes, I’m using an insecure/http endpoint):&lt;/p&gt; &lt;div class="language-go highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;&lt;span class="n"&gt;log_level&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"INFO"&lt;/span&gt;
&lt;span class="n"&gt;domains&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"spire-server"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"spire-oidc-discovery"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"localhost"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="n"&gt;Use&lt;/span&gt; &lt;span class="n"&gt;HTTP&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;local&lt;/span&gt; &lt;span class="n"&gt;development&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;no&lt;/span&gt; &lt;span class="n"&gt;certificates&lt;/span&gt; &lt;span class="n"&gt;needed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;insecure_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;":8443"&lt;/span&gt;
&lt;span class="n"&gt;allow_insecure_scheme&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt; &lt;span class="n"&gt;server_api&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"unix:///tmp/spire-server/private/api.sock"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;health_checks&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;Lastly, we’ll need to tune some parameters on the &lt;code class="language-plaintext highlighter-rouge"&gt;server.conf&lt;/code&gt; for the SPIRE server itself:&lt;/p&gt; &lt;div class="language-go highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;&lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="n"&gt;Add&lt;/span&gt; &lt;span class="n"&gt;JWT&lt;/span&gt; &lt;span class="n"&gt;issuer&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;OIDC&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;using&lt;/span&gt; &lt;span class="n"&gt;HTTP&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;local&lt;/span&gt; &lt;span class="n"&gt;development&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;jwt_issuer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"http://spire-server:8443"&lt;/span&gt; &lt;span class="n"&gt;default_jwt_svid_ttl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"1m"&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="n"&gt;Configure&lt;/span&gt; &lt;span class="n"&gt;RSA&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;required&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;OIDC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;ca_key_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"rsa-2048"&lt;/span&gt; &lt;span class="err"&gt;#&lt;/span&gt; &lt;span class="n"&gt;Add&lt;/span&gt; &lt;span class="n"&gt;federation&lt;/span&gt; &lt;span class="n"&gt;bundle&lt;/span&gt; &lt;span class="n"&gt;endpoint&lt;/span&gt; &lt;span class="n"&gt;federation&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;bundle_endpoint&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.0.0.0"&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;8443&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
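
# For the client workload to be issued an SVID at all, it also needs a SPIRE registration
# entry. A minimal sketch -- the SPIFFE ID matches this example, while the parent ID and
# selector are assumptions for a Docker-attested agent:
#
#   spire-server entry create \
#     -parentID spiffe://example.org/spire-agent \
#     -spiffeID spiffe://example.org/mcp-test-client \
#     -selector docker:label:app:mcp-test-client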
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;If we curl this discovery endpoint, we can see the discovery metadata and keys:&lt;/p&gt; &lt;div class="language-bash highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;❯ curl &lt;span class="nt"&gt;-L&lt;/span&gt; &amp;lt;a href="http://localhost:18443/.well-known/openid-configuration" rel="nofollow"&amp;gt;http://localhost:18443/.well-known/openid-configuration&amp;lt;/a&amp;gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"issuer"&lt;/span&gt;: &lt;span class="s2"&gt;"http://localhost:18443"&lt;/span&gt;, &lt;span class="s2"&gt;"jwks_uri"&lt;/span&gt;: &lt;span class="s2"&gt;"http://localhost:18443/keys"&lt;/span&gt;, &lt;span class="s2"&gt;"authorization_endpoint"&lt;/span&gt;: &lt;span class="s2"&gt;""&lt;/span&gt;, &lt;span class="s2"&gt;"response_types_supported"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"id_token"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;, &lt;span class="s2"&gt;"subject_types_supported"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"public"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;, &lt;span class="s2"&gt;"id_token_signing_alg_values_supported"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"RS256"&lt;/span&gt;, &lt;span class="s2"&gt;"ES256"&lt;/span&gt;, &lt;span class="s2"&gt;"ES384"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;JWKS endpoint:&lt;/p&gt; &lt;div class="language-bash highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;❯ curl &lt;span class="nt"&gt;-L&lt;/span&gt; &amp;lt;a href="http://localhost:18443/keys" rel="nofollow"&amp;gt;http://localhost:18443/keys&amp;lt;/a&amp;gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"keys"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="s2"&gt;"kty"&lt;/span&gt;: &lt;span class="s2"&gt;"RSA"&lt;/span&gt;, &lt;span class="s2"&gt;"kid"&lt;/span&gt;: &lt;span class="s2"&gt;"n0xvkL8A2W3DofkHTJPvlGpeEBJeQB6g"&lt;/span&gt;, &lt;span class="s2"&gt;"alg"&lt;/span&gt;: &lt;span class="s2"&gt;"RS256"&lt;/span&gt;, &lt;span class="s2"&gt;"n"&lt;/span&gt;: &lt;span class="s2"&gt;"sAp_Vd-X-W7OllYPm_TTk0zvUj443Y9MfQvy4onBcursyxOajcoeSOeNpTdh4QEmLKV3xC8Zq Yv4fkzFp6UTf-_rwPs_uwOpbhPKT-QQZKcconxaf8RkA0m-mzOVHbU7eA3esHLTzN84kbGkr1wozQes yC-MHFE3EwLR9xI1YZfWbHtlXOcnTgBXitgysM5Yw4jkXy7kYvjs21MyEJ01_WSSHCLaISAjlAvnDL WiGV3xx0Vd29m8-mrR5pg4_eicBifxnQnksO_LWRy8jXKk2JTftRKnmIxwqHML_fbVej8RSsaGpu0askj 83gZ4wNDi8KNh7c9ir6yWl9jgDJ3lYQ"&lt;/span&gt;, &lt;span class="s2"&gt;"e"&lt;/span&gt;: &lt;span class="s2"&gt;"AQAB"&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
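
# This /keys endpoint is what the Keycloak client's "JWKS URL" should point at; from inside the
# compose network that is http://spire-oidc-discovery:8443/keys (see the jwks_url claim below).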
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;See the &lt;a class="external" href="https://github.com/spiffe/spire/blob/main/support/oidc-discovery-provider/README.md" rel="nofollow"&gt;SPIRE OIDC Discovery Provider&lt;/a&gt; for more.&lt;/p&gt; &lt;p&gt;&lt;img alt="" src="https://blog.christianposta.com/images/agent-security/spiffe-oauth/client-auth-2.png"/&gt;&lt;/p&gt; &lt;p&gt;With this setup, we can now configure the Keycloak JWKS endpoint to point to the SPIRE OIDC Discovery endpoint:&lt;/p&gt; &lt;p&gt;&lt;img alt="" src="https://blog.christianposta.com/images/agent-security/spiffe-oauth/keycloak5.png"/&gt;&lt;/p&gt; &lt;h2&gt;OAuth Client Authentication with SPIFFE in Action&lt;/h2&gt; &lt;p&gt;With Keycloak configured to use our SPIFFE SVID JWT authenticator, and correctly pointing to the SPIRE JWKS, we can now get a workload SVID and make a call to Keycloak for an authorization flow / client credentials flow to get an access token. To get a SPIFFE JWT SVID, we can call the &lt;code class="language-plaintext highlighter-rouge"&gt;spire-agent&lt;/code&gt; workload API. Here’s an example SPIFFE JWT SVID:&lt;/p&gt; &lt;div class="language-json highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:8080/realms/mcp-realm"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"client_auth"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"client-spiffe-jwt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"environment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"production"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"exp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1753800643&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"iat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1753800583&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://spire-server:8443"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"jwks_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://spire-oidc-discovery:8443/keys"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"organization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Solo.io Agent IAM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="s2"&gt;"mcp:read mcp:tools mcp:prompts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"spiffe://example.org/mcp-test-client"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;This JWT is signed by the SPIRE server and contains the correct SPIFFE ID (&lt;code class="language-plaintext highlighter-rouge"&gt;spiffe://example.org/mcp-test-client&lt;/code&gt;). It has a tight expiration period and includes additional software statements. Note that the &lt;code class="language-plaintext highlighter-rouge"&gt;client_auth&lt;/code&gt; software statement / claim points to &lt;code class="language-plaintext highlighter-rouge"&gt;client-spiffe-jwt&lt;/code&gt;, which was the &lt;code class="language-plaintext highlighter-rouge"&gt;PROVIDER_ID&lt;/code&gt; we specified in our &lt;code class="language-plaintext highlighter-rouge"&gt;SpiffeSvidClientAuthenticator&lt;/code&gt; class.&lt;/p&gt; &lt;p&gt;With this SPIFFE JWT SVID, we can call the token endpoint, passing &lt;code class="language-plaintext highlighter-rouge"&gt;spiffe-svid-jwt&lt;/code&gt; as the client assertion type and the SVID ($JWT) as the client assertion. In this particular example, we are using a &lt;code class="language-plaintext highlighter-rouge"&gt;client_credentials&lt;/code&gt; flow:&lt;/p&gt; &lt;div class="language-bash highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$KEYCLOAK_URL&lt;/span&gt;&lt;span class="s2"&gt;/realms/&lt;/span&gt;&lt;span class="nv"&gt;$KEYCLOAK_REALM&lt;/span&gt;&lt;span class="s2"&gt;/protocol/openid-connect/token"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt; &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/x-www-form-urlencoded"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"client_id=&lt;/span&gt;&lt;span class="nv"&gt;$CLIENT_ID&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"grant_type=client_credentials"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"client_assertion_type=urn:ietf:params:oauth:client-assertion-type:spiffe-svid-jwt"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"client_assertion=&lt;/span&gt;&lt;span class="nv"&gt;$JWT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s2"&gt;"scope=mcp:read mcp:tools mcp:prompts"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;p&gt;If this is successful, Keycloak will issue an access token:&lt;/p&gt; &lt;div class="language-json highlighter-rouge"&gt;&lt;div class="highlight"&gt;&lt;pre class="highlight"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"exp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1753804189&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"iat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1753800589&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"jti"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"trrtcc:35d1fb20-31fa-4055-afb8-e902d0dc25d4"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://localhost:8080/realms/mcp-realm"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"6e4b5bc5-9a5c-4f87-aa1e-06ad279da0c8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"typ"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bearer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"azp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"spiffe://example.org/mcp-test-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"acr"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"profile email"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"email_verified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"clientHost"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"192.168.65.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"preferred_username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"service-account-spiffe://example.org/mcp-test-client"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"clientAddress"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"192.168.65.1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span 
class="nl"&gt;"client_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"spiffe://example.org/mcp-test-client"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt; &lt;h2&gt;Wrapping Up&lt;/h2&gt; &lt;p&gt;In this post, we explored how Agent / MCP identity based on SPIFFE can be used as a first-class authentication mechanism for OAuth clients. By integrating SPIFFE JWT SVIDs with Keycloak’s client authentication flow, we eliminated the need for static secrets and created a more secure, scalable model for authenticating MCP clients, especially in environments where agents and services need short-lived, verifiable credentials.&lt;/p&gt; &lt;p&gt;While this approach required some customization in Keycloak (through its SPI model) and configuration of the SPIRE OIDC Discovery endpoint, the end result is a working OAuth flow powered by cryptographically-verifiable, zero-trust-friendly identity. This isn’t just a more secure option; it’s a necessary evolution as we shift toward AI-native, agentic architectures that demand dynamic trust relationships and automated credential management.&lt;/p&gt;  &lt;/div&gt;&lt;/div&gt;</summary></entry><entry><title>A Few Things About the Anchor Element’s href You Might Not Have Known</title><link href="https://blog.jim-nielsen.com/2025/href-value-possibilities/" rel="alternate"></link><published>2025-08-26T14:45:50.407000Z</published><id>https://blog.jim-nielsen.com/2025/href-value-possibilities/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/a-few-things-about-t/7504645:d79acf"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/7504645.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Jim Nielsen’s Blog.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="copy e-content"&gt;&lt;p&gt;I’ve written previously about &lt;a class="external" href="https://blog.jim-nielsen.com/2023/reloading-document-in-html-and-preserve-query-params/" rel="nofollow"&gt;reloading a document using only HTML&lt;/a&gt; but that got me thinking: What are all the values you can put in an anchor tag’s &lt;code&gt;href&lt;/code&gt; attribute?&lt;/p&gt;&lt;p&gt;Well, &lt;a class="external" href="https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/a#href" rel="nofollow"&gt;I looked around&lt;/a&gt;. I found some things I already knew about, e.g.&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Link protocols like &lt;code&gt;mailto:&lt;/code&gt;, &lt;code&gt;tel:&lt;/code&gt;, &lt;code&gt;sms:&lt;/code&gt; and &lt;code&gt;javascript:&lt;/code&gt; which deal with specific ways of handling links.&lt;/li&gt;&lt;li&gt;Protocol-relative links, e.g. &lt;code&gt;href="//"&lt;/code&gt;&lt;/li&gt;&lt;li&gt;&lt;a class="external" href="https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment/Text_fragments" rel="nofollow"&gt;Text fragments&lt;/a&gt; for linking to specific pieces of text on a page, e.g. &lt;code&gt;href="#:~:text=foo"&lt;/code&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;But I also found some things I didn’t know about (or only vaguely knew about) so I wrote them down in an attempt to remember them.&lt;/p&gt;&lt;h2&gt;href="#"&lt;/h2&gt;&lt;p&gt;Scrolls to the top of a document. I knew that.&lt;/p&gt;&lt;p&gt;But I’m writing because &lt;code&gt;#top&lt;/code&gt; will also scroll to the top &lt;em&gt;if&lt;/em&gt; there isn’t another element with &lt;code&gt;id="top"&lt;/code&gt; in the document. I didn’t know that.&lt;/p&gt;&lt;p&gt;(&lt;a class="external" href="https://html.spec.whatwg.org/multipage/browsing-the-web.html#scrolling-to-a-fragment" rel="nofollow"&gt;Spec&lt;/a&gt;: “If &lt;em&gt;decodedFragment&lt;/em&gt; is an ASCII case-insensitive match for the string &lt;code&gt;top&lt;/code&gt;, then return the top of the document.”)&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; &lt;a class="external" href="https://mastodon.social/@HTeuMeuLeu/114971342411854119" rel="nofollow"&gt;HTeuMeuLeu pointed out to me on Mastodon&lt;/a&gt; that you can use &lt;code&gt;#page=&lt;/code&gt; to deep-link to a specific page in a PDF, e.g. 
&lt;code&gt;my-file.pdf#page=42&lt;/code&gt; would link to page 42 in the file.&lt;/p&gt;&lt;h2&gt;href=""&lt;/h2&gt;&lt;p&gt;Reloads the current page, preserving the search string but removing the hash string (if present).&lt;/p&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;URL&lt;/th&gt;&lt;th&gt;Resolves to&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/#foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/?id=foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/?id=foo&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/?id=foo#bar&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/?id=foo&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;h2&gt;href="."&lt;/h2&gt;&lt;p&gt;Reloads the current page, removing both the search and hash strings (if present).&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: If you’re using &lt;code&gt;href="."&lt;/code&gt; as a link to the current page, ensure your URLs have a trailing slash or you may get surprising navigation behavior. The path is interpreted as a file, so &lt;code&gt;"."&lt;/code&gt; resolves to the parent directory of the current location.&lt;/p&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;URL&lt;/th&gt;&lt;th&gt;Resolves to&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path#foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path?id=foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/#foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/?id=foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path/index.html&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path/&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;p&gt;&lt;strong&gt;Update 2025-08-15&lt;/strong&gt;: as &lt;a class="external" href="https://front-end.social/@AmeliaBR/114971821114512797" rel="nofollow"&gt;pointed out by @AmeliaBR on Mastodon&lt;/a&gt;, “reloads the current page” probably isn’t the best terminology for this. It’s more like “loads the default index page for the current directory, based on the URL structure” which might be a reload, but might be something else based on the current URL (see my note and table above).&lt;/p&gt;&lt;h2&gt;href="?"&lt;/h2&gt;&lt;p&gt;Reloads the current page, removing both the search and hash strings (if present). &lt;em&gt;However&lt;/em&gt;, it preserves the &lt;code&gt;?&lt;/code&gt; character.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Unlike &lt;code&gt;href="."&lt;/code&gt;, trailing slashes don’t matter. 
The search parameters will be removed but the path will be preserved as-is.&lt;/p&gt;&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;URL&lt;/th&gt;&lt;th&gt;Resolves to&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path?&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path#foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path?&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path?id=foo&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path?&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/path?id=foo#bar&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/path?&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code&gt;/index.html&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code&gt;/index.html?&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;h2&gt;href="data:"&lt;/h2&gt;&lt;p&gt;You can make links that navigate to &lt;a class="external" href="https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Schemes/data" rel="nofollow"&gt;data URLs&lt;/a&gt;. The super-readable version of this would be:&lt;/p&gt;&lt;pre&gt;&lt;code class="language language-html"&gt;&lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-name"&gt;a&lt;/span&gt; &lt;span class="hljs-attr"&gt;href&lt;/span&gt;=&lt;span class="hljs-string"&gt;"data:text/plain,hello world"&lt;/span&gt;&amp;gt;&lt;/span&gt; View plain text data URL
&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-name"&gt;a&lt;/span&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;But you probably want &lt;code&gt;data:&lt;/code&gt; URLs to be encoded so you don’t get unexpected behavior, e.g.&lt;/p&gt;&lt;pre&gt;&lt;code class="language language-html"&gt;&lt;span class="hljs-tag"&gt;&amp;lt;&lt;span class="hljs-name"&gt;a&lt;/span&gt; &lt;span class="hljs-attr"&gt;href&lt;/span&gt;=&lt;span class="hljs-string"&gt;"data:text/plain,hello%20world"&lt;/span&gt;&amp;gt;&lt;/span&gt; View plain text data URL
&lt;span class="hljs-tag"&gt;&amp;lt;/&lt;span class="hljs-name"&gt;a&lt;/span&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Go ahead and try it (FYI: may not work in your user agent). Here’s a &lt;a class="external" href="http://data:text/plain,hello%20world" rel="nofollow"&gt;plain-text file&lt;/a&gt; and an &lt;a class="external" href="http://data:text/html,%3Ch1%3Ehello%20world%3C/h1%3E" rel="nofollow"&gt;HTML file&lt;/a&gt;.&lt;/p&gt;&lt;h2&gt;href="video.mp4#t=10,20"&lt;/h2&gt;&lt;p&gt;&lt;a class="external" href="https://www.w3.org/TR/media-frags/" rel="nofollow"&gt;Media fragments&lt;/a&gt; allow linking to specific parts of a media file, like audio or video.&lt;/p&gt;&lt;p&gt;&lt;a class="external" href="https://indieweb.org/media_fragment" rel="nofollow"&gt;For example&lt;/a&gt;, &lt;code&gt;video.mp4#t=10,20&lt;/code&gt; links to a video. It starts play at 10 seconds, and stops it at 20 seconds.&lt;/p&gt;&lt;p&gt;(&lt;a class="external" href="https://caniuse.com/media-fragments" rel="nofollow"&gt;Support&lt;/a&gt; is limited at the time of this writing.)&lt;/p&gt;&lt;h2&gt;See For Yourself&lt;/h2&gt;&lt;p&gt;I tested a lot of this stuff in the browser and via JS. I think I got all these right.&lt;/p&gt;&lt;p&gt;Thanks to &lt;a class="external" href="https://developer.mozilla.org/en-US/docs/Web/API/URL/URL" rel="nofollow"&gt;JavaScript’s URL constructor&lt;/a&gt; (and the ability to pass a &lt;code&gt;base&lt;/code&gt; URL), I could programmatically explore how a lot of these href’s would resolve.&lt;/p&gt;&lt;p&gt;Here’s a snippet of the test code I wrote. You can copy/paste this in your console and they should all pass 🤞&lt;/p&gt;&lt;pre&gt;&lt;code class="language language-js"&gt;&lt;span class="hljs-keyword"&gt;const&lt;/span&gt; assertions = [ { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;''&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;''&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;''&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/#foo'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;''&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/?id=foo'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/?id=foo'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;''&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/?id=foo#bar'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/?id=foo'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'.'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path'&lt;/span&gt;, &lt;span 
class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'.'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path#foo`&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/`&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'.'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path?id=foo`&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/`&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'.'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/`&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/`&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'.'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/#foo`&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/`&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'.'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/?id=foo`&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/`&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'.'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/index.html`&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;`/path/`&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'?'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path?'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'?'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path#foo'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path?'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'?'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path?id=foo'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path?'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'?'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/?'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'?'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span 
class="hljs-string"&gt;'/path/?id=foo#bar'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/path/?'&lt;/span&gt; }, { &lt;span class="hljs-attr"&gt;href&lt;/span&gt;: &lt;span class="hljs-string"&gt;'?'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;location&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/index.html#foo'&lt;/span&gt;, &lt;span class="hljs-attr"&gt;resolves_to&lt;/span&gt;: &lt;span class="hljs-string"&gt;'/index.html?'&lt;/span&gt;}
]; &lt;span class="hljs-keyword"&gt;const&lt;/span&gt; assertions_evaluated = assertions.&lt;span class="hljs-title function_"&gt;map&lt;/span&gt;(&lt;span class="hljs-function"&gt;(&lt;span class="hljs-params"&gt;{ href, location, resolves_to }&lt;/span&gt;) =&amp;gt;&lt;/span&gt; { &lt;span class="hljs-keyword"&gt;const&lt;/span&gt; domain = &lt;span class="hljs-string"&gt;'https://example.com'&lt;/span&gt;; &lt;span class="hljs-keyword"&gt;const&lt;/span&gt; expected = &lt;span class="hljs-keyword"&gt;new&lt;/span&gt; &lt;span class="hljs-title function_"&gt;URL&lt;/span&gt;(href, domain + location).&lt;span class="hljs-title function_"&gt;toString&lt;/span&gt;(); &lt;span class="hljs-keyword"&gt;const&lt;/span&gt; received = &lt;span class="hljs-keyword"&gt;new&lt;/span&gt; &lt;span class="hljs-title function_"&gt;URL&lt;/span&gt;(domain + resolves_to).&lt;span class="hljs-title function_"&gt;toString&lt;/span&gt;(); &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; { href, location, &lt;span class="hljs-attr"&gt;expected&lt;/span&gt;: expected.&lt;span class="hljs-title function_"&gt;replace&lt;/span&gt;(domain, &lt;span class="hljs-string"&gt;''&lt;/span&gt;), &lt;span class="hljs-attr"&gt;received&lt;/span&gt;: received.&lt;span class="hljs-title function_"&gt;replace&lt;/span&gt;(domain, &lt;span class="hljs-string"&gt;''&lt;/span&gt;), &lt;span class="hljs-attr"&gt;passed&lt;/span&gt;: expected === received };
}); &lt;span class="hljs-variable language_"&gt;console&lt;/span&gt;.&lt;span class="hljs-title function_"&gt;table&lt;/span&gt;(assertions_evaluated);
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</summary></entry><entry><title>Disaggregated Prefilling (experimental)¶</title><link href="https://docs.vllm.ai/en/latest/features/disagg_prefill.html" rel="alternate"></link><published>2025-08-22T21:00:17.167000Z</published><id>https://docs.vllm.ai/en/latest/features/disagg_prefill.html</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/disaggregated-prefil/0:e66eff"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

</summary></entry><entry><title>Modern Node.js Patterns for 2025</title><link href="https://kashw1n.com/blog/nodejs-2025/" rel="alternate"></link><published>2025-08-07T13:10:14.996000Z</published><id>https://kashw1n.com/blog/nodejs-2025/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/modern-nodejs-patter/0:58de92"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="content"&gt; &lt;img alt="Modern Node.js development workflow" src="https://kashw1n.com/static/nodejs-2025.png" width="400"/&gt;
&lt;p&gt;Node.js has undergone a remarkable transformation since its early days. If you’ve been writing Node.js for several years, you’ve likely witnessed this evolution firsthand—from the callback-heavy, CommonJS-dominated landscape to today’s clean, standards-based development experience.&lt;/p&gt;
&lt;p&gt;The changes aren’t just cosmetic; they represent a fundamental shift in how we approach server-side JavaScript development. Modern Node.js embraces web standards, reduces external dependencies, and provides a more intuitive developer experience. Let’s explore these transformations and understand why they matter for your applications in 2025.&lt;/p&gt;
&lt;h2&gt;1. Module System: ESM is the New Standard&lt;/h2&gt;
&lt;p&gt;The module system is perhaps where you’ll notice the biggest difference. CommonJS served us well, but ES Modules (ESM) have become the clear winner, offering better tooling support and alignment with web standards.&lt;/p&gt;
&lt;h3&gt;The Old Way (CommonJS)&lt;/h3&gt;
&lt;p&gt;Let’s look at how we used to structure modules. This approach required explicit exports and synchronous imports:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// math.js&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;function&lt;/span&gt;&lt;span&gt; add&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;a&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;b&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; a&lt;/span&gt;&lt;span&gt; +&lt;/span&gt;&lt;span&gt; b&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;module&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;exports&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt; };&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// app.js&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; require&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'./math'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This worked fine, but it had limitations—no static analysis, no tree-shaking, and it didn’t align with browser standards.&lt;/p&gt;
&lt;h3&gt;The Modern Way (ES Modules with Node: Prefix)&lt;/h3&gt;
&lt;p&gt;Modern Node.js development embraces ES Modules with a crucial addition—the &lt;code&gt;node:&lt;/code&gt; prefix for built-in modules. This explicit naming prevents confusion and makes dependencies crystal clear:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// math.js&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;export&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; add&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;a&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;b&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; a&lt;/span&gt;&lt;span&gt; +&lt;/span&gt;&lt;span&gt; b&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// app.js&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; './math.js'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;readFile&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:fs/promises'&lt;/span&gt;&lt;span&gt;; &lt;/span&gt;&lt;span&gt;// Modern node: prefix&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;createServer&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:http'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;node:&lt;/code&gt; prefix is more than just a convention—it’s a clear signal to both developers and tools that you’re importing Node.js built-ins rather than npm packages. This prevents potential conflicts and makes your code more explicit about its dependencies.&lt;/p&gt;
&lt;h3&gt;Top-Level Await: Simplifying Initialization&lt;/h3&gt;
&lt;p&gt;One of the most game-changing features is top-level await. No more wrapping your entire application in an async function just to use await at the module level:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// app.js - Clean initialization without wrapper functions&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;readFile&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:fs/promises'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; config&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; JSON&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;parse&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;await&lt;/span&gt;&lt;span&gt; readFile&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'config.json'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;'utf8'&lt;/span&gt;&lt;span&gt;));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; server&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; createServer&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;/* ... */&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'App started with config:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;config&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;appName&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This eliminates the common pattern of immediately-invoked async function expressions (IIFE) that we used to see everywhere. Your code becomes more linear and easier to reason about.&lt;/p&gt;
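&lt;p&gt;For contrast, here’s a minimal sketch of the async IIFE wrapper that top-level await replaces (using the same hypothetical &lt;code&gt;config.json&lt;/code&gt; as above):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Before top-level await: wrap module initialization in an async IIFE
import { readFile } from 'node:fs/promises';

(async () =&amp;gt; {
  try {
    const config = JSON.parse(await readFile('config.json', 'utf8'));
    console.log('App started with config:', config.appName);
  } catch (error) {
    console.error('Startup failed:', error);
    process.exit(1);
  }
})();
&lt;/code&gt;&lt;/pre&gt;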
&lt;h2&gt;2. Built-in Web APIs: Reducing External Dependencies&lt;/h2&gt;
&lt;p&gt;Node.js has embraced web standards in a big way, bringing APIs that web developers already know directly into the runtime. This means fewer dependencies and more consistency across environments.&lt;/p&gt;
&lt;h3&gt;Fetch API: No More HTTP Library Dependencies&lt;/h3&gt;
&lt;p&gt;Remember when every project needed axios, node-fetch, or similar libraries for HTTP requests? Those days are over. Node.js now includes the Fetch API natively:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// Old way - external dependencies required&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; axios&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; require&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'axios'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; response&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; axios&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;get&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'https://api.example.com/data'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Modern way - built-in fetch with enhanced features&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; response&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; fetch&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'https://api.example.com/data'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; data&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; response&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;json&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;But the modern approach goes beyond just replacing your HTTP library. You get sophisticated timeout and cancellation support built-in:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; fetchData&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;url&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; response&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; fetch&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;url&lt;/span&gt;&lt;span&gt;, {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; signal&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;AbortSignal&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;timeout&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;5000&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;// Built-in timeout support&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;!&lt;/span&gt;&lt;span&gt;response&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ok&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; throw&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; Error&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;`HTTP &lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;response&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;status&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;response&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;statusText&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt;`&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; response&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;json&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt; ===&lt;/span&gt;&lt;span&gt; 'TimeoutError'&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; throw&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; Error&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Request timed out'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; throw&lt;/span&gt;&lt;span&gt; error&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This approach eliminates the need for timeout libraries and provides a consistent error handling experience. The &lt;code&gt;AbortSignal.timeout()&lt;/code&gt; method is particularly elegant—it creates a signal that automatically aborts after the specified time.&lt;/p&gt;
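&lt;p&gt;If you also need caller-initiated cancellation on top of the timeout, the two signals can be combined. Here’s a rough sketch (the &lt;code&gt;fetchDataCancellable&lt;/code&gt; helper is hypothetical, and &lt;code&gt;AbortSignal.any()&lt;/code&gt; assumes a recent Node.js release that ships it):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Sketch: honor both a caller-provided signal and a per-request timeout.
// Assumes a Node.js version where AbortSignal.any() is available.
async function fetchDataCancellable(url, { signal, timeoutMs = 5000 } = {}) {
  const combined = signal
    ? AbortSignal.any([signal, AbortSignal.timeout(timeoutMs)])
    : AbortSignal.timeout(timeoutMs);

  const response = await fetch(url, { signal: combined });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${response.statusText}`);
  }
  return await response.json();
}

// Callers can still cancel explicitly, independent of the timeout
const controller = new AbortController();
fetchDataCancellable('https://api.example.com/data', { signal: controller.signal })
  .then(data =&amp;gt; console.log('Data received:', data))
  .catch(error =&amp;gt; console.error('Request failed or was cancelled:', error.message));
&lt;/code&gt;&lt;/pre&gt;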
&lt;h3&gt;AbortController: Graceful Operation Cancellation&lt;/h3&gt;
&lt;p&gt;Modern applications need to handle cancellation gracefully, whether it’s user-initiated or due to timeouts. AbortController provides a standardized way to cancel operations:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// Cancel long-running operations cleanly&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; controller&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; AbortController&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Set up automatic cancellation&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;setTimeout&lt;/span&gt;&lt;span&gt;(() &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; controller&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;abort&lt;/span&gt;&lt;span&gt;(), &lt;/span&gt;&lt;span&gt;10000&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; data&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; fetch&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'https://slow-api.com/data'&lt;/span&gt;&lt;span&gt;, {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; signal&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;controller&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;signal&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Data received:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;data&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;} &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt; ===&lt;/span&gt;&lt;span&gt; 'AbortError'&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Request was cancelled - this is expected behavior'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;else&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Unexpected error:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This pattern works across many Node.js APIs, not just fetch. You can use the same AbortController with file operations, database queries, and any async operation that supports cancellation.&lt;/p&gt;
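&lt;p&gt;For instance, here’s a small sketch of the same pattern applied to file I/O; the &lt;code&gt;fs/promises&lt;/code&gt; functions such as &lt;code&gt;readFile&lt;/code&gt; accept a &lt;code&gt;signal&lt;/code&gt; option (&lt;code&gt;large-file.json&lt;/code&gt; is just a placeholder path):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import { readFile } from 'node:fs/promises';

const controller = new AbortController();
setTimeout(() =&amp;gt; controller.abort(), 5000); // give up after 5 seconds

try {
  // readFile accepts a signal option, so the same AbortController pattern covers file I/O
  const contents = await readFile('large-file.json', {
    encoding: 'utf8',
    signal: controller.signal
  });
  console.log('Read', contents.length, 'characters');
} catch (error) {
  if (error.name === 'AbortError') {
    console.log('File read was cancelled');
  } else {
    throw error;
  }
}
&lt;/code&gt;&lt;/pre&gt;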
&lt;h2&gt;3. Built-in Testing: Professional Testing Without External Dependencies&lt;/h2&gt;
&lt;p&gt;Testing used to require choosing between Jest, Mocha, Ava, or other frameworks. Node.js now includes a full-featured test runner that covers most testing needs without any external dependencies.&lt;/p&gt;
&lt;h3&gt;Modern Testing with Node.js Built-in Test Runner&lt;/h3&gt;
&lt;p&gt;The built-in test runner provides a clean, familiar API that feels modern and complete:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// test/math.test.js&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;test&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;describe&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:test'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; assert&lt;/span&gt;&lt;span&gt; from&lt;/span&gt;&lt;span&gt; 'node:assert'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;multiply&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; '../math.js'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;describe&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Math functions'&lt;/span&gt;&lt;span&gt;, () &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; test&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'adds numbers correctly'&lt;/span&gt;&lt;span&gt;, () &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; assert&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;strictEqual&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;), &lt;/span&gt;&lt;span&gt;5&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; test&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'handles async operations'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; () &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; multiply&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; assert&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;strictEqual&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;6&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; test&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'throws on invalid input'&lt;/span&gt;&lt;span&gt;, () &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; assert&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;throws&lt;/span&gt;&lt;span&gt;(() &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; add&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'a'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;'b'&lt;/span&gt;&lt;span&gt;),&lt;/span&gt;&lt;span&gt; /Invalid input/&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;});&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;What makes this particularly powerful is how seamlessly it integrates with the Node.js development workflow:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;# Run all tests with built-in runner&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;node&lt;/span&gt;&lt;span&gt; --test&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;# Watch mode for development&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;node&lt;/span&gt;&lt;span&gt; --test&lt;/span&gt;&lt;span&gt; --watch&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;# Coverage reporting (Node.js 20+)&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;node&lt;/span&gt;&lt;span&gt; --test&lt;/span&gt;&lt;span&gt; --experimental-test-coverage&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The watch mode is especially valuable during development—your tests re-run automatically as you modify code, providing immediate feedback without any additional configuration.&lt;/p&gt;
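&lt;p&gt;If you want these commands wired into your project, a &lt;code&gt;package.json&lt;/code&gt; sketch along these lines keeps them discoverable (the script names are only a convention, not something Node.js requires):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{
  "name": "my-app",
  "type": "module",
  "scripts": {
    "test": "node --test",
    "test:watch": "node --test --watch",
    "test:coverage": "node --test --experimental-test-coverage"
  }
}
&lt;/code&gt;&lt;/pre&gt;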
&lt;h2&gt;4. Sophisticated Asynchronous Patterns&lt;/h2&gt;
&lt;p&gt;While async/await isn’t new, the patterns around it have matured significantly. Modern Node.js development leverages these patterns more effectively and combines them with newer APIs.&lt;/p&gt;
&lt;h3&gt;Async/Await with Enhanced Error Handling&lt;/h3&gt;
&lt;p&gt;Modern error handling combines async/await with sophisticated error recovery and parallel execution patterns:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;readFile&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;writeFile&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:fs/promises'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; processData&lt;/span&gt;&lt;span&gt;() {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; // Parallel execution of independent operations&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; [&lt;/span&gt;&lt;span&gt;config&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;userData&lt;/span&gt;&lt;span&gt;] &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; Promise&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;all&lt;/span&gt;&lt;span&gt;([&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; readFile&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'config.json'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;'utf8'&lt;/span&gt;&lt;span&gt;),&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; fetch&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'/api/user'&lt;/span&gt;&lt;span&gt;).&lt;/span&gt;&lt;span&gt;then&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;r&lt;/span&gt;&lt;span&gt; =&amp;gt;&lt;/span&gt;&lt;span&gt; r&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;json&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; ]);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; processed&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; processUserData&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;userData&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;JSON&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;parse&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;config&lt;/span&gt;&lt;span&gt;));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; writeFile&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'output.json'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;JSON&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;stringify&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;processed&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;null&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt;));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; processed&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; // Structured error logging with context&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Processing failed:'&lt;/span&gt;&lt;span&gt;, {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; error&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; stack&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;stack&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; timestamp&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; Date&lt;/span&gt;&lt;span&gt;().&lt;/span&gt;&lt;span&gt;toISOString&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; throw&lt;/span&gt;&lt;span&gt; error&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This pattern combines parallel execution for performance with comprehensive error handling. The &lt;code&gt;Promise.all()&lt;/code&gt; ensures that independent operations run concurrently, while the try/catch provides a single point for error handling with rich context.&lt;/p&gt;
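&lt;p&gt;When the individual operations are allowed to fail independently, &lt;code&gt;Promise.allSettled()&lt;/code&gt; is the variant to reach for. Here’s a hedged sketch of the same fan-out with partial-failure handling (the extra endpoints are hypothetical):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;// Unlike Promise.all(), allSettled() never rejects: each result is either
// { status: 'fulfilled', value } or { status: 'rejected', reason }
const results = await Promise.allSettled([
  fetch('/api/user').then(r =&amp;gt; r.json()),
  fetch('/api/preferences').then(r =&amp;gt; r.json()),   // hypothetical endpoints
  fetch('/api/notifications').then(r =&amp;gt; r.json())
]);

const [user, preferences, notifications] = results.map(result =&amp;gt;
  result.status === 'fulfilled' ? result.value : null
);

const failures = results.filter(r =&amp;gt; r.status === 'rejected');
if (failures.length &amp;gt; 0) {
  console.warn(`${failures.length} request(s) failed, continuing with partial data`);
}
&lt;/code&gt;&lt;/pre&gt;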
&lt;h3&gt;Modern Event Handling with AsyncIterators&lt;/h3&gt;
&lt;p&gt;Event-driven programming has evolved beyond simple event listeners. AsyncIterators provide a more powerful way to handle streams of events:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;EventEmitter&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;once&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:events'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; DataProcessor&lt;/span&gt;&lt;span&gt; extends&lt;/span&gt;&lt;span&gt; EventEmitter&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; async&lt;/span&gt;&lt;span&gt; *&lt;/span&gt;&lt;span&gt;processStream&lt;/span&gt;&lt;span&gt;() {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; for&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;let&lt;/span&gt;&lt;span&gt; i&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; 0&lt;/span&gt;&lt;span&gt;; &lt;/span&gt;&lt;span&gt;i&lt;/span&gt;&lt;span&gt; &amp;lt;&lt;/span&gt;&lt;span&gt; 10&lt;/span&gt;&lt;span&gt;; &lt;/span&gt;&lt;span&gt;i&lt;/span&gt;&lt;span&gt;++&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;emit&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'data'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;`chunk-&lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;i&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt;`&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; yield&lt;/span&gt;&lt;span&gt; `processed-&lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;i&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt;`&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; // Simulate async processing time&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; Promise&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;resolve&lt;/span&gt;&lt;span&gt; =&amp;gt;&lt;/span&gt;&lt;span&gt; setTimeout&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;resolve&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;100&lt;/span&gt;&lt;span&gt;));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;emit&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'end'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Consume events as an async iterator&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; processor&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; DataProcessor&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;for&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt; of&lt;/span&gt;&lt;span&gt; processor&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;processStream&lt;/span&gt;&lt;span&gt;()) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Processed:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This approach is particularly powerful because it combines the flexibility of events with the control flow of async iteration. You can process events in sequence, handle backpressure naturally, and break out of processing loops cleanly.&lt;/p&gt;
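&lt;p&gt;The &lt;code&gt;node:events&lt;/code&gt; module can also turn a plain emitter's events directly into an async iterator via its &lt;code&gt;on()&lt;/code&gt; helper. Below is a minimal sketch (the emitter, event names, and timings are illustrative rather than part of the example above); an &lt;code&gt;AbortSignal&lt;/code&gt; is one way to end the loop cleanly:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import { EventEmitter, on } from 'node:events';&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const emitter = new EventEmitter();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const ac = new AbortController();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Emit a few events asynchronously, then abort to end the iteration&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;setTimeout(() =&amp;gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  emitter.emit('data', 'first');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  emitter.emit('data', 'second');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  setTimeout(() =&amp;gt; ac.abort(), 10);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}, 10);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;try {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  // on() yields each 'data' event's arguments as an array&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  for await (const [chunk] of on(emitter, 'data', { signal: ac.signal })) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    console.log('Received:', chunk);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;} catch (error) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  // Aborting is the expected way out of the loop&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  if (error.name !== 'AbortError') throw error;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console.log('Finished consuming events');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;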
&lt;h2&gt;5. Advanced Streams with Web Standards Integration&lt;/h2&gt;
&lt;p&gt;Streams remain one of Node.js’s most powerful features, but they’ve evolved to embrace web standards and provide better interoperability.&lt;/p&gt;
&lt;h3&gt;Modern Stream Processing&lt;/h3&gt;
&lt;p&gt;Stream processing has become more intuitive with better APIs and clearer patterns:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;Readable&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;Transform&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:stream'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;pipeline&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:stream/promises'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;createReadStream&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;createWriteStream&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:fs'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Create transform streams with clean, focused logic&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; upperCaseTransform&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; Transform&lt;/span&gt;&lt;span&gt;({&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; objectMode&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;true&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; transform&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;chunk&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;encoding&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;callback&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;push&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;chunk&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;toString&lt;/span&gt;&lt;span&gt;().&lt;/span&gt;&lt;span&gt;toUpperCase&lt;/span&gt;&lt;span&gt;());&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; callback&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;});&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Process files with robust error handling&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; processFile&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;inputFile&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;outputFile&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; pipeline&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; createReadStream&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;inputFile&lt;/span&gt;&lt;span&gt;),&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; upperCaseTransform&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; createWriteStream&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;outputFile&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; );&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'File processed successfully'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Pipeline failed:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; throw&lt;/span&gt;&lt;span&gt; error&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;pipeline&lt;/code&gt; function with promises provides automatic cleanup and error handling, eliminating many of the traditional pain points with stream processing.&lt;/p&gt;
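&lt;p&gt;For simple transformations, &lt;code&gt;pipeline()&lt;/code&gt; also accepts plain async generator functions as intermediate stages, so you don't need to construct a &lt;code&gt;Transform&lt;/code&gt; instance at all. A minimal sketch (the file paths are placeholders):&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import { pipeline } from 'node:stream/promises';&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import { createReadStream, createWriteStream } from 'node:fs';&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// An async generator acts as the transform stage&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;await pipeline(&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  createReadStream('input.txt'),       // placeholder input path&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  async function* (source) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    for await (const chunk of source) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;      yield chunk.toString().toUpperCase();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  },&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  createWriteStream('output.txt')      // placeholder output path&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console.log('Transformed with a generator stage');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;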
&lt;h3&gt;Web Streams Interoperability&lt;/h3&gt;
&lt;p&gt;Modern Node.js can seamlessly work with Web Streams, enabling better compatibility with browser code and edge runtime environments:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// Create a Web Stream (compatible with browsers)&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; webReadable&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; ReadableStream&lt;/span&gt;&lt;span&gt;({&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; start&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;controller&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; controller&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;enqueue&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Hello '&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; controller&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;enqueue&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'World!'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; controller&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;close&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;});&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Convert between Web Streams and Node.js streams&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; nodeStream&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; Readable&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fromWeb&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;webReadable&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; backToWeb&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; Readable&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;toWeb&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;nodeStream&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This interoperability is crucial for applications that need to run in multiple environments or share code between server and client.&lt;/p&gt;
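&lt;p&gt;A common place this interoperability shows up is &lt;code&gt;fetch()&lt;/code&gt;: the response body is a Web &lt;code&gt;ReadableStream&lt;/code&gt;, which can be handed straight to Node.js stream utilities. A minimal sketch (the URL and output path are placeholders):&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import { Readable } from 'node:stream';&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import { pipeline } from 'node:stream/promises';&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import { createWriteStream } from 'node:fs';&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// fetch() returns a Web ReadableStream body; convert it to a Node.js stream&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const response = await fetch('https://example.com/data.json'); // placeholder URL&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;if (!response.ok) throw new Error(`Request failed: ${response.status}`);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;await pipeline(&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  Readable.fromWeb(response.body),&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  createWriteStream('data.json')       // placeholder output path&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console.log('Response body streamed to disk');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;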
&lt;h2&gt;6. Worker Threads: True Parallelism for CPU-Intensive Tasks&lt;/h2&gt;
&lt;p&gt;JavaScript’s single-threaded nature isn’t always ideal for CPU-intensive work. Worker threads provide a way to leverage multiple cores effectively while maintaining the simplicity of JavaScript.&lt;/p&gt;
&lt;h3&gt;Background Processing Without Blocking&lt;/h3&gt;
&lt;p&gt;Worker threads are perfect for computationally expensive tasks that would otherwise block the main event loop:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// worker.js - Isolated computation environment&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;parentPort&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;workerData&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:worker_threads'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;function&lt;/span&gt;&lt;span&gt; fibonacci&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;n&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;n&lt;/span&gt;&lt;span&gt; &amp;lt;&lt;/span&gt;&lt;span&gt; 2&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; n&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; fibonacci&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;n&lt;/span&gt;&lt;span&gt; -&lt;/span&gt;&lt;span&gt; 1&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; fibonacci&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;n&lt;/span&gt;&lt;span&gt; -&lt;/span&gt;&lt;span&gt; 2&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; fibonacci&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;workerData&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;number&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;parentPort&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;postMessage&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The main application can delegate heavy computations without blocking other operations:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// main.js - Non-blocking delegation&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;Worker&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:worker_threads'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;fileURLToPath&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:url'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; calculateFibonacci&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;number&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; Promise&lt;/span&gt;&lt;span&gt;((&lt;/span&gt;&lt;span&gt;resolve&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;reject&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; worker&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; Worker&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; fileURLToPath&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; URL&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'./worker.js'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;meta&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;url&lt;/span&gt;&lt;span&gt;)),&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;workerData&lt;/span&gt;&lt;span&gt;: { &lt;/span&gt;&lt;span&gt;number&lt;/span&gt;&lt;span&gt; } }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; );&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; worker&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;on&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'message'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;resolve&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; worker&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;on&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'error'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;reject&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; worker&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;on&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'exit'&lt;/span&gt;&lt;span&gt;, (&lt;/span&gt;&lt;span&gt;code&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;code&lt;/span&gt;&lt;span&gt; !==&lt;/span&gt;&lt;span&gt; 0&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; reject&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; Error&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;`Worker stopped with exit code &lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;code&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt;`&lt;/span&gt;&lt;span&gt;));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Your main application remains responsive&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Starting calculation...'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; calculateFibonacci&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;40&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Fibonacci result:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Application remained responsive throughout!'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This pattern allows your application to utilize multiple CPU cores while keeping the familiar async/await programming model.&lt;/p&gt;
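&lt;p&gt;Because each call to &lt;code&gt;calculateFibonacci&lt;/code&gt; spawns its own worker, several CPU-bound calculations can run on separate cores at the same time. A minimal sketch reusing the function above:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// Each calculation runs in its own worker, so they execute in parallel&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const [fib38, fib39, fib40] = await Promise.all([&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  calculateFibonacci(38),&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  calculateFibonacci(39),&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  calculateFibonacci(40)&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;]);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console.log('Parallel results:', fib38, fib39, fib40);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;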
&lt;h2&gt;7. Enhanced Development Experience&lt;/h2&gt;
&lt;p&gt;Modern Node.js prioritizes developer experience with built-in tools that previously required external packages or complex configurations.&lt;/p&gt;
&lt;h3&gt;Watch Mode and Environment Management&lt;/h3&gt;
&lt;p&gt;Development workflow has been significantly streamlined with built-in watch mode and environment file support:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;{&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "name"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"modern-node-app"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "type"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"module"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "engines"&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "node"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"&amp;gt;=20.0.0"&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; },&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "scripts"&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "dev"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"node --watch --env-file=.env app.js"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "test"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"node --test --watch"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "start"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"node app.js"&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;--watch&lt;/code&gt; flag eliminates the need for nodemon, while &lt;code&gt;--env-file&lt;/code&gt; removes the dependency on dotenv. Your development environment becomes simpler and faster:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// .env file automatically loaded with --env-file&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// DATABASE_URL=postgres://localhost:5432/mydb&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// API_KEY=secret123&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// app.js - Environment variables available immediately&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Connecting to:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;process&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;env&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;DATABASE_URL&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'API Key loaded:'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;process&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;env&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;API_KEY&lt;/span&gt;&lt;span&gt; ?&lt;/span&gt;&lt;span&gt; 'Yes'&lt;/span&gt;&lt;span&gt; :&lt;/span&gt;&lt;span&gt; 'No'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These features make development more pleasant by reducing configuration overhead and eliminating restart cycles.&lt;/p&gt;
&lt;h2&gt;8. Modern Security and Performance Monitoring&lt;/h2&gt;
&lt;p&gt;Security and performance have become first-class concerns with built-in tools for monitoring and controlling application behavior.&lt;/p&gt;
&lt;h3&gt;Permission Model for Enhanced Security&lt;/h3&gt;
&lt;p&gt;The experimental permission model allows you to restrict what your application can access, following the principle of least privilege:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;# Run with restricted file system access&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;node&lt;/span&gt;&lt;span&gt; --experimental-permission&lt;/span&gt;&lt;span&gt; --allow-fs-read=./data&lt;/span&gt;&lt;span&gt; --allow-fs-write=./logs&lt;/span&gt;&lt;span&gt; app.js&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;# Network restrictions &lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;node&lt;/span&gt;&lt;span&gt; --experimental-permission&lt;/span&gt;&lt;span&gt; --allow-net=api.example.com&lt;/span&gt;&lt;span&gt; app.js&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;# Above allow-net feature not avaiable yet, PR merged in node.js repo, will be available in future release&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is particularly valuable for applications that process untrusted code or need to demonstrate security compliance.&lt;/p&gt;
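&lt;p&gt;When the permission model is enabled, code can also check its own permissions at runtime before attempting an operation. A minimal sketch, assuming the process was started with the flags shown above (the path is a placeholder):&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// process.permission is only present when the permission model is enabled&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;if (!process.permission) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  console.log('Permission model not enabled; access is unrestricted');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;} else if (process.permission.has('fs.write', './logs')) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  console.log('Write access to ./logs confirmed');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;} else {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  console.warn('No write access to ./logs; skipping file logging');&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;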
&lt;h3&gt;Built-in Performance Monitoring&lt;/h3&gt;
&lt;p&gt;Performance monitoring is now built into the platform, eliminating the need for external APM tools for basic monitoring:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;PerformanceObserver&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;performance&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; 'node:perf_hooks'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Set up automatic performance monitoring&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; obs&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; PerformanceObserver&lt;/span&gt;&lt;span&gt;((&lt;/span&gt;&lt;span&gt;list&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; for&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; entry&lt;/span&gt;&lt;span&gt; of&lt;/span&gt;&lt;span&gt; list&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getEntries&lt;/span&gt;&lt;span&gt;()) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;entry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;duration&lt;/span&gt;&lt;span&gt; &amp;gt;&lt;/span&gt;&lt;span&gt; 100&lt;/span&gt;&lt;span&gt;) { &lt;/span&gt;&lt;span&gt;// Log slow operations&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;`Slow operation detected: &lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;entry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt; took &lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;entry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;duration&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt;ms`&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;});&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;obs&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;observe&lt;/span&gt;&lt;span&gt;({ &lt;/span&gt;&lt;span&gt;entryTypes&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;span&gt;'function'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;'http'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;'dns'&lt;/span&gt;&lt;span&gt;] });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Instrument your own operations&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; processLargeDataset&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;data&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; performance&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;mark&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'processing-start'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; heavyProcessing&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;data&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; performance&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;mark&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'processing-end'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; performance&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;measure&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'data-processing'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;'processing-start'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;'processing-end'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This provides visibility into application performance without external dependencies, helping you identify bottlenecks early in development.&lt;/p&gt;
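&lt;p&gt;Note that &lt;code&gt;'function'&lt;/code&gt; entries are only emitted for functions wrapped with &lt;code&gt;performance.timerify()&lt;/code&gt;. A minimal sketch (the wrapped function is a stand-in for your own expensive code):&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import { PerformanceObserver, performance } from 'node:perf_hooks';&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Stand-in for an expensive synchronous function&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;function heavySync(n) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  let total = 0;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  for (let i = 0; i &amp;lt; n; i++) total += Math.sqrt(i);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  return total;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// timerify() wraps the function so each call produces a 'function' entry&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const timedHeavySync = performance.timerify(heavySync);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const fnObserver = new PerformanceObserver((list) =&amp;gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  for (const entry of list.getEntries()) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    console.log(`${entry.name} took ${entry.duration.toFixed(2)}ms`);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  fnObserver.disconnect();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;});&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;fnObserver.observe({ entryTypes: ['function'] });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;timedHeavySync(1_000_000);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;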
&lt;h2&gt;9. Application Distribution and Deployment&lt;/h2&gt;
&lt;p&gt;Modern Node.js makes application distribution simpler with features like single executable applications and improved packaging.&lt;/p&gt;
&lt;h3&gt;Single Executable Applications&lt;/h3&gt;
&lt;p&gt;You can now bundle your Node.js application into a single executable file, simplifying deployment and distribution:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;# Create a self-contained executable&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;node&lt;/span&gt;&lt;span&gt; --experimental-sea-config&lt;/span&gt;&lt;span&gt; sea-config.json&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The configuration file defines how your application gets bundled:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;{&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "main"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"app.js"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "output"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"my-app-bundle.blob"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "disableExperimentalSEAWarning"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;true&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is particularly valuable for CLI tools, desktop applications, or any scenario where you want to distribute your application without requiring users to install Node.js separately.&lt;/p&gt;
&lt;h2&gt;10. Modern Error Handling and Diagnostics&lt;/h2&gt;
&lt;p&gt;Error handling has evolved beyond simple try/catch blocks to include structured error handling and comprehensive diagnostics.&lt;/p&gt;
&lt;h3&gt;Structured Error Handling&lt;/h3&gt;
&lt;p&gt;Modern applications benefit from structured, contextual error handling that provides better debugging information:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; AppError&lt;/span&gt;&lt;span&gt; extends&lt;/span&gt;&lt;span&gt; Error&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; constructor&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;code&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;statusCode&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; 500&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;context&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; {}) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; super&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; 'AppError'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;code&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; code&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;statusCode&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; statusCode&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;context&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; context&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;timestamp&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; Date&lt;/span&gt;&lt;span&gt;().&lt;/span&gt;&lt;span&gt;toISOString&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; toJSON&lt;/span&gt;&lt;span&gt;() {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; name&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; message&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; code&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;code&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; statusCode&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;statusCode&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; context&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;context&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; timestamp&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;timestamp&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; stack&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;stack&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; };&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Usage with rich context&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;throw&lt;/span&gt;&lt;span&gt; new&lt;/span&gt;&lt;span&gt; AppError&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; 'Database connection failed'&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; 'DB_CONNECTION_ERROR'&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; 503&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;host&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;'localhost'&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;port&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;5432&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;retryAttempt&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This approach provides much richer error information for debugging and monitoring, while maintaining a consistent error interface across your application.&lt;/p&gt;
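&lt;p&gt;At an application boundary, such as a top-level handler, the same interface can be consumed in one place. A minimal sketch (&lt;code&gt;connectToDatabase&lt;/code&gt; is a hypothetical function that throws the &lt;code&gt;AppError&lt;/code&gt; above):&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;try {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  await connectToDatabase(); // hypothetical function that throws AppError&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;} catch (error) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  if (error instanceof AppError) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    // toJSON() gives a consistent, structured payload for logs or API responses&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    console.error(JSON.stringify(error, null, 2));&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    process.exitCode = 1;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  } else {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;    throw error; // unknown errors should not be swallowed&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;  }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;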
&lt;h3&gt;Advanced Diagnostics&lt;/h3&gt;
&lt;p&gt;Node.js includes sophisticated diagnostic capabilities that help you understand what’s happening inside your application:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; diagnostics_channel&lt;/span&gt;&lt;span&gt; from&lt;/span&gt;&lt;span&gt; 'node:diagnostics_channel'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Create custom diagnostic channels&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; dbChannel&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; diagnostics_channel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;channel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'app:database'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;const&lt;/span&gt;&lt;span&gt; httpChannel&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; diagnostics_channel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;channel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'app:http'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Subscribe to diagnostic events&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;dbChannel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;subscribe&lt;/span&gt;&lt;span&gt;((&lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;=&amp;gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'Database operation:'&lt;/span&gt;&lt;span&gt;, {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; operation&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;operation&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; duration&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;duration&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; query&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;query&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;});&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Publish diagnostic information&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; queryDatabase&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;sql&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;params&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; start&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; performance&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;now&lt;/span&gt;&lt;span&gt;();&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; db&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;query&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;sql&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;params&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; dbChannel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;publish&lt;/span&gt;&lt;span&gt;({&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; operation&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;'query'&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; sql&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; params&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; duration&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;performance&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;now&lt;/span&gt;&lt;span&gt;() &lt;/span&gt;&lt;span&gt;-&lt;/span&gt;&lt;span&gt; start&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; success&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;true&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; result&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; dbChannel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;publish&lt;/span&gt;&lt;span&gt;({&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; operation&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;'query'&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; sql&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; params&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; duration&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;performance&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;now&lt;/span&gt;&lt;span&gt;() &lt;/span&gt;&lt;span&gt;-&lt;/span&gt;&lt;span&gt; start&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; success&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;false&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; error&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;message&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; });&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; throw&lt;/span&gt;&lt;span&gt; error&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This diagnostic information can be consumed by monitoring tools, logged for analysis, or used to trigger automatic remediation actions.&lt;/p&gt;
&lt;h2&gt;11. Modern Package Management and Module Resolution&lt;/h2&gt;
&lt;p&gt;Package management and module resolution have become more sophisticated, with better support for monorepos, internal packages, and flexible module resolution.&lt;/p&gt;
&lt;h3&gt;Import Maps and Internal Package Resolution&lt;/h3&gt;
&lt;p&gt;Modern Node.js supports subpath imports (an import-map-style &lt;code&gt;imports&lt;/code&gt; field in &lt;code&gt;package.json&lt;/code&gt;), allowing you to create clean internal module references:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;{&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "imports"&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "#config"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"./src/config/index.js"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "#utils/*"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"./src/utils/*.js"&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; "#db"&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;"./src/database/connection.js"&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This creates a clean, stable interface for internal modules:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// Clean internal imports that don't break when you reorganize&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; config&lt;/span&gt;&lt;span&gt; from&lt;/span&gt;&lt;span&gt; '#config'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; { &lt;/span&gt;&lt;span&gt;logger&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;validator&lt;/span&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt; '#utils/common'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;import&lt;/span&gt;&lt;span&gt; db&lt;/span&gt;&lt;span&gt; from&lt;/span&gt;&lt;span&gt; '#db'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;These internal imports make refactoring easier and provide a clear distinction between internal and external dependencies.&lt;/p&gt;
&lt;h3&gt;Dynamic Imports for Flexible Loading&lt;/h3&gt;
&lt;p&gt;Dynamic imports enable sophisticated loading patterns, including conditional loading and code splitting:&lt;/p&gt;
&lt;pre class="astro-code one-dark-pro"&gt;&lt;code&gt;&lt;span class="line"&gt;&lt;span&gt;// Load features based on configuration or environment&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; loadDatabaseAdapter&lt;/span&gt;&lt;span&gt;() {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; dbType&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; process&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;env&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;DATABASE_TYPE&lt;/span&gt;&lt;span&gt; ||&lt;/span&gt;&lt;span&gt; 'sqlite'&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; adapter&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; import&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;`#db/adapters/&lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;dbType&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt;`&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; adapter&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;default&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; } &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;error&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; console&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;warn&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;`Database adapter &lt;/span&gt;&lt;span&gt;${&lt;/span&gt;&lt;span&gt;dbType&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;span&gt; not available, falling back to sqlite`&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; fallback&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; import&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'#db/adapters/sqlite'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; fallback&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;default&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;// Conditional feature loading&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;async&lt;/span&gt;&lt;span&gt; function&lt;/span&gt;&lt;span&gt; loadOptionalFeatures&lt;/span&gt;&lt;span&gt;() {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; features&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; [];&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;process&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;env&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ENABLE_ANALYTICS&lt;/span&gt;&lt;span&gt; ===&lt;/span&gt;&lt;span&gt; 'true'&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; analytics&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; import&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'#features/analytics'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; features&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;push&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;analytics&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;default&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;process&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;env&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ENABLE_MONITORING&lt;/span&gt;&lt;span&gt; ===&lt;/span&gt;&lt;span&gt; 'true'&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; const&lt;/span&gt;&lt;span&gt; monitoring&lt;/span&gt;&lt;span&gt; =&lt;/span&gt;&lt;span&gt; await&lt;/span&gt;&lt;span&gt; import&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;'#features/monitoring'&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; features&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;push&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;monitoring&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;default&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; }&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt; return&lt;/span&gt;&lt;span&gt; features&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;span&gt;}&lt;/span&gt;&lt;/span&gt;
&lt;span class="line"&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This pattern allows you to build applications that adapt to their environment and only load the code they actually need.&lt;/p&gt;
&lt;h2&gt;The Path Forward: Key Takeaways for Modern Node.js (2025)&lt;/h2&gt;
&lt;p&gt;As we look at the current state of Node.js development, several key principles emerge:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Embrace Web Standards&lt;/strong&gt;: Use &lt;code&gt;node:&lt;/code&gt; prefixes, fetch API, AbortController, and Web Streams for better compatibility and reduced dependencies&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Leverage Built-in Tools&lt;/strong&gt;: The test runner, watch mode, and environment file support reduce external dependencies and configuration complexity&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Think in Modern Async Patterns&lt;/strong&gt;: Top-level await, structured error handling, and async iterators make code more readable and maintainable&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use Worker Threads Strategically&lt;/strong&gt;: For CPU-intensive tasks, worker threads provide true parallelism without blocking the main thread&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Adopt Progressive Enhancement&lt;/strong&gt;: Use permission models, diagnostics channels, and performance monitoring to build robust, observable applications&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Optimize for Developer Experience&lt;/strong&gt;: Watch mode, built-in testing, and import maps create a more pleasant development workflow&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Plan for Distribution&lt;/strong&gt;: Single executable applications and modern packaging make deployment simpler&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The transformation of Node.js from a simple JavaScript runtime to a comprehensive development platform is remarkable. By adopting these modern patterns, you’re not just writing contemporary code—you’re building applications that are more maintainable, performant, and aligned with the broader JavaScript ecosystem.&lt;/p&gt;
&lt;p&gt;The beauty of modern Node.js lies in its evolution while maintaining backward compatibility. You can adopt these patterns incrementally, and they work alongside existing code. Whether you’re starting a new project or modernizing an existing one, these patterns provide a clear path toward more robust, enjoyable Node.js development.&lt;/p&gt;
&lt;p&gt;As we move through 2025, Node.js continues to evolve, but the foundational patterns we’ve explored here provide a solid base for building applications that will remain modern and maintainable for years to come.&lt;/p&gt; &lt;/div&gt;</summary></entry><entry><title>L4S and the Future of Real-Time Performance in 5G and Beyond</title><link href="https://blog.3g4g.co.uk/2025/07/l4s-and-future-of-real-time-performance.html" rel="alternate"></link><published>2025-07-24T19:35:15.693000Z</published><id>https://blog.3g4g.co.uk/2025/07/l4s-and-future-of-real-time-performance.html</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/l4s-and-the-future-o/5168105:3b196a"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/5168105.png" style="vertical-align: middle;width:16px;height:16px;"&gt; The 3G4G Blog.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="post hentry"&gt;
&lt;a class="external" rel="nofollow"&gt;&lt;/a&gt; &lt;div class="post-body entry-content"&gt;
&lt;p&gt;As mobile networks continue to evolve to support increasingly immersive and responsive services, the importance of consistent low latency has never been greater. Whether it is cloud gaming, extended reality, remote machine operation or real-time collaboration, all these applications rely on the ability to react instantly to user input. The slightest delay can affect the user experience, making the role of the network even more critical.&lt;/p&gt;&lt;div class="separator"&gt;&lt;a class="external" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicNADJfuJYZUEKMUsO3vTmDy4DZCkEyeJcPa3k1c4rwwbeeyfXr7GmVc5fUCdjGBSqp4HMQF2TdotE5qbSY9g9ISmrHq7go3mCorAP3VaqT-nnvdO5CqJOaG5xET9494Ei9OZMyf5AVjGn3Zb07NmGEwJUqrnjBKwyspfwwPeeOvROkOPhWJO430VnC2w/s1920/NokiaBellLabs_L4Swhitepaper_1.jpg" rel="nofollow"&gt;&lt;img alt="" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEicNADJfuJYZUEKMUsO3vTmDy4DZCkEyeJcPa3k1c4rwwbeeyfXr7GmVc5fUCdjGBSqp4HMQF2TdotE5qbSY9g9ISmrHq7go3mCorAP3VaqT-nnvdO5CqJOaG5xET9494Ei9OZMyf5AVjGn3Zb07NmGEwJUqrnjBKwyspfwwPeeOvROkOPhWJO430VnC2w/w640-h360/NokiaBellLabs_L4Swhitepaper_1.jpg" width="640"/&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;While 5G has introduced major improvements in radio latency and overall throughput, many time-critical applications are still affected by a factor that is often overlooked - &lt;a class="external" href="https://www.nokia.com/bell-labs/research/l4s/" rel="nofollow"&gt;queuing delay&lt;/a&gt;. This occurs when packets build up in buffers before they are forwarded, creating spikes in delay and jitter. Traditional methods for congestion control, such as those based on packet loss, are too slow to react, especially in mobile environments where network conditions can change rapidly.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;em&gt;Low Latency, Low Loss and Scalable Throughput (L4S)&lt;/em&gt;&lt;/strong&gt;, is a new network innovation designed to tackle this challenge. It is an Internet protocol mechanism developed through the Internet Engineering Task Force, and has recently reached standardisation. L4S focuses on preventing queuing delays by marking packets early when congestion is building, instead of waiting until buffers overflow and packets are dropped. The key idea is to use explicit signals within the network to guide congestion control at the sender side.&lt;/p&gt;&lt;p&gt;Applications that support L4S are able to reduce their sending rate quickly when congestion starts to appear. This is done by using ECN, or Explicit Congestion Notification, which involves marking rather than dropping packets. The result is a smooth and continuous flow of data, where latency remains low and throughput remains high, even in changing network conditions.&lt;/p&gt;&lt;p&gt;One of the significant benefits of L4S is its ability to support a wide range of real-time services at scale. &lt;a class="external" href="https://www.ericsson.com/en/reports-and-papers/white-papers/enabling-time-critical-applications-over-5g-with-rate-adaptation" rel="nofollow"&gt;Ericsson highlights&lt;/a&gt; how edge-based applications such as cloud gaming, virtual reality and drone control need stable low-latency connections alongside high bitrates. While over-the-top approaches to congestion control may work for general streaming, they struggle in mobile environments. This is due to variability in channel quality and radio access delays, which can cause sudden spikes in latency. 
L4S provides a faster and more direct way to detect congestion within the radio network, enabling better performance for these time-sensitive applications.&lt;/p&gt;&lt;div class="separator"&gt;&lt;a class="external" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN5pVzannMkH0DWDq_aHEqA9JEMxfGB94Zp9GJwp5UYAD2fo20AFEPc4xZf5Lb5ht6FdWyVnyDtL5C386OMLsNGl7bUBDlCXTSvlZDS3rf5Wcnr5x7gj7CgxnD65vEwuYmdlPr9HD87vzdeXX6vOvg3kqtnddwQb0hLTj0xnLNGJSQZXSsCnrECP3eqnM/s1920/Ericsson_L4SsolutionInRAN.jpg" rel="nofollow"&gt;&lt;img alt="" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgN5pVzannMkH0DWDq_aHEqA9JEMxfGB94Zp9GJwp5UYAD2fo20AFEPc4xZf5Lb5ht6FdWyVnyDtL5C386OMLsNGl7bUBDlCXTSvlZDS3rf5Wcnr5x7gj7CgxnD65vEwuYmdlPr9HD87vzdeXX6vOvg3kqtnddwQb0hLTj0xnLNGJSQZXSsCnrECP3eqnM/w640-h360/Ericsson_L4SsolutionInRAN.jpg" width="640"/&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;To make this possible, mobile networks need to support L4S in a way that keeps its traffic separate from traditional data flows. This involves using dedicated queues for L4S traffic to ensure it is not delayed behind bulk data transfers. In 5G, this is implemented through dedicated quality-of-service flows, allowing network elements to detect and handle L4S traffic differently. For example, if a mobile user is playing a cloud-based game, the network can identify this traffic and place it on an L4S-optimised flow. This avoids interference from other applications, such as file downloads or video streaming.&lt;/p&gt;&lt;p&gt;&lt;a class="external" href="https://www.nokia.com/asset/213410/" rel="nofollow"&gt;Nokia's approach&lt;/a&gt; further explains how L4S enables fair sharing of bandwidth between classic and L4S traffic without compromising performance. A dual-queue system allows both types of traffic to coexist while preserving the low-latency characteristics of L4S. This is especially important in scenarios where both legacy and L4S-capable applications are in use. In simulations and trials, the L4S mechanism has shown the ability to maintain very low delay even when the link experiences sudden reductions in capacity, which is common in mobile and Wi-Fi networks.&lt;/p&gt;&lt;p&gt;One of the important aspects of L4S is that it requires support both from the application side and within the network. On the application side, rate adaptation based on L4S can be implemented within the app itself, often using modern transport protocols such as QUIC or TCP extensions. Many companies, including device makers and platform providers, are already trialling support for this approach.&lt;/p&gt;&lt;p&gt;Within the network, L4S depends on the ability of routers and radio access equipment to read and mark ECN bits correctly. In mobile networks, the radio access network is typically the key bottleneck where marking should take place. This ensures that congestion is detected at the right point in the path, allowing for quicker response and improved performance.&lt;/p&gt;&lt;p&gt;Although L4S is distinct from ultra-reliable low-latency communication, it can complement those use cases where guaranteed service is needed in controlled environments. What makes L4S more versatile is its scalability and suitability for open internet and large-scale public network use. 
It can work across both fixed and mobile access networks, providing a common framework for interactive services regardless of access technology.&lt;/p&gt;&lt;p&gt;With L4S in place, it becomes possible to offer new kinds of applications that were previously limited by latency constraints. This includes lighter and more wearable XR headsets that can offload processing to the cloud, or port automation systems that rely on remote control of heavy equipment. Even everyday experiences, such as video calls or online gaming, stand to benefit from a more responsive and stable network connection.&lt;/p&gt;&lt;p&gt;Ultimately, L4S offers a practical and forward-looking approach to delivering the consistent low latency needed for the next generation of digital experiences. By creating a tighter feedback loop between the network and the application, and by applying congestion signals in a more intelligent way, L4S helps unlock the full potential of 5G and future networks.&lt;/p&gt;&lt;p&gt;This &lt;a class="external" href="https://www.youtube.com/watch?v=hBlJ7wXtK_o" rel="nofollow"&gt;introductory video&lt;/a&gt; by CableLabs is a good starting point for anyone willing to dig deeper in the topic. This &lt;a class="external" href="https://www.linkedin.com/posts/deanbubley_l4s-wifi-wifi8-activity-7333434605863145472-sslo/" rel="nofollow"&gt;LinkedIn post&lt;/a&gt; by Dean Bubley and the comments are also worth a read.&lt;/p&gt;&lt;p&gt;&lt;em&gt;&lt;strong&gt;PS&lt;/strong&gt;: Just noticed that T-Mobile USA have announced earlier this week that they are the first to unlock L4S in wireless . You can read their blog post &lt;a class="external" href="https://www.t-mobile.com/news/network/unlock-l4s-5g-advanced" rel="nofollow"&gt;here&lt;/a&gt; and a promotional video is available in the Tweet below 👇&lt;/em&gt;&lt;/p&gt;&lt;blockquote class="twitter-tweet"&gt;&lt;p&gt;Your apps need more than speed—they need responsiveness.L4S is now live on our 5G Advanced network, delivering lower latency, less lag, and smarter performance for things like XR, video calls, and remote driving.&lt;/p&gt;&lt;p&gt;CTO &lt;a class="external" href="https://twitter.com/JohnSaw?ref_src=twsrc%5Etfw" rel="nofollow"&gt;@JohnSaw&lt;/a&gt; explains: &lt;a class="external" href="https://t.co/4VMI3WbZE2" rel="nofollow"&gt;https://t.co/4VMI3WbZE2&lt;/a&gt; &lt;a class="external" href="https://t.co/mZ60nvDoM4" rel="nofollow"&gt;pic.twitter.com/mZ60nvDoM4&lt;/a&gt;&lt;/p&gt;— T-Mobile Business (@TMobileBusiness) &lt;a class="external" href="https://twitter.com/TMobileBusiness/status/1947318651369271575?ref_src=twsrc%5Etfw" rel="nofollow"&gt;July 21, 2025&lt;/a&gt;&lt;/blockquote&gt; &lt;/div&gt; &lt;/div&gt;</summary></entry><entry><title>NVIDIA Tensor Core Evolution: From Volta To Blackwell</title><link href="https://semianalysis.com/2025/06/23/nvidia-tensor-core-evolution-from-volta-to-blackwell/" rel="alternate"></link><published>2025-06-24T14:45:13.116000Z</published><id>https://semianalysis.com/2025/06/23/nvidia-tensor-core-evolution-from-volta-to-blackwell/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/nvidia-tensor-core-e/8271433:d6741c"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/8271433.png" style="vertical-align: middle;width:16px;height:16px;"&gt; SemiAnalysis.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="entry-content wp-block-post-content has-global-padding is-layout-constrained wp-block-post-content-is-layout-constrained"&gt;
&lt;p&gt;In our&lt;a class="external" href="https://semianalysis.com/2024/12/11/scaling-laws-o1-pro-architecture-reasoning-training-infrastructure-orion-and-claude-3-5-opus-failures/" rel="nofollow"&gt; AI Scaling Laws article from late last year&lt;/a&gt;, we discussed how multiple stacks of AI scaling laws have continued to drive the AI industry forward, enabling greater than Moore’s Law growth in model capabilities as well as a commensurately rapid reduction in unit token costs. These scaling laws are driven by training and inference optimizations and innovations, but advancements in compute capabilities transcending Moore’s Law have also played a critical role.&lt;/p&gt; &lt;p&gt;On this front, in the AI Scaling Laws article, we revisited the decades-long debate around compute scaling, recounting the end of Dennard Scaling in the late 2000s as well as the end of classic Moore’s Law-pace declines in cost per transistor by the late 2010s. Despite this, compute capabilities have continued to improve at a rapid pace, with the baton being passed to other technologies such as &lt;a class="external" href="https://semianalysis.com/2021/12/15/advanced-packaging-part-1-pad-limited/" rel="nofollow"&gt;advanced packaging&lt;/a&gt;, &lt;a class="external" href="https://semianalysis.com/2025/02/05/iedm2024/" rel="nofollow"&gt;3D stacking&lt;/a&gt;, &lt;a class="external" href="https://semianalysis.com/2023/02/21/the-future-of-the-transistor/" rel="nofollow"&gt;new transistor types&lt;/a&gt; and specialized architectures such as the GPU.&lt;/p&gt;  &lt;p&gt;When it comes to AI and deep learning, GPU compute capabilities have improved at a faster-than-Moore’s-Law pace, consistently delivering remarkable “&lt;a class="external" href="https://en.wikipedia.org/wiki/Huang%27s_law" rel="nofollow"&gt;Huang’s Law&lt;/a&gt;” performance improvements year after year. The technology at the heart of driving this improvement is the Tensor Core.&lt;/p&gt; &lt;p&gt;Though the Tensor Core is unquestionably the bedrock upon which the foundations of modern AI and machine learning are built, it is not well understood, even by many experienced practitioners in the field. The rapid evolution of GPU architecture and of the programming models that run on this architecture means that it is increasingly challenging for Machine Learning researchers and scientists to keep up with the latest changes to Tensor Cores and grasp the implications of these changes.&lt;/p&gt;  &lt;p&gt;In this report, we will introduce the core features of the major datacenter GPUs, first explaining important first principles of performance engineering. We will then trace the evolution of Nvidia’s Tensor Core architectures and programming models, highlighting the motivations behind this evolution. Our end goal is to provide a resource for understanding Nvidia’s GPU architecture and offer intuitive insights into their architectural evolution. Only after explaining each architecture can we explain the beauty of the Blackwell Tensor Core and its new memory hierarchy.&lt;/p&gt; &lt;p&gt;Note that a solid grasp of computer architecture is a prerequisite for following many of the explanations and discussions in this article; we provide a brief section about CUDA programming as a refresher rather than explaining foundational concepts of GPU architecture. 
Instead, we build on the forefront of Tensor Core knowledge, extending understanding of this cutting-edge technology by turning what is currently tribal knowledge into accessible, structured insight through detailed explanation.&lt;/p&gt; &lt;p&gt;Just as a university will teach 101 courses as well as 4000-level courses, different articles at SemiAnalysis will cater to varying levels of understanding of the subject matter as well as to readers in different vocations and specializations.&lt;/p&gt; &lt;p&gt;We would like to thank our collaborators:&lt;/p&gt; &lt;ul class="wp-block-list"&gt;
&lt;li&gt;&lt;a class="external" href="https://research.colfax-intl.com" rel="nofollow"&gt;Jay Shah&lt;/a&gt;, Colfax Research: Terrific CUTLASS tutorials and numerous meetings meticulously checking the technical details&lt;/li&gt; &lt;li&gt;&lt;a class="external" href="https://benjaminfspector.com/" rel="nofollow"&gt;Ben Spector&lt;/a&gt;, Stanford Hazy Research: Offered great insights into programming model change and writing advice&lt;/li&gt; &lt;li&gt;&lt;a class="external" href="https://tridao.me/" rel="nofollow"&gt;Tri Dao&lt;/a&gt;, Princeton and Together AI: Reviewed drafts and gave detailed feedback&lt;/li&gt; &lt;li&gt;&lt;a class="external" href="https://www.neilmovva.com/about/" rel="nofollow"&gt;Neil Movva&lt;/a&gt;, Together AI: Reviewed drafts and offered insights into GPU kernel writing&lt;/li&gt; &lt;li&gt;&lt;a class="external" href="https://charlesfrye.github.io/about/" rel="nofollow"&gt;Charles Frye&lt;/a&gt;, Modal: Pedagogical GPU Glossary and general review of the draft&lt;/li&gt; &lt;li&gt;&lt;a class="external" href="https://simonguo.tech/" rel="nofollow"&gt;Simon Guo&lt;/a&gt;, Stanford PhD student: Illustrated the cover picture and reviewed the draft&lt;/li&gt; &lt;li&gt;NVIDIA: Shared context around the progression of Tensor Core designs. Teams include: &lt;/li&gt; &lt;li&gt;Many other GPU wizards&lt;/li&gt;
&lt;/ul&gt; &lt;p&gt;SemiAnalysis will be posting exclusive content on &lt;a class="external" href="http://instagram.com/semianalysis" rel="nofollow"&gt;Instagram Reels&lt;/a&gt; and &lt;a class="external" href="https://www.tiktok.com/@semianalysis" rel="nofollow"&gt;TikTok&lt;/a&gt; starting next week. Follow our socials to get the latest insights on the AI and GPU industry.&lt;/p&gt; &lt;p&gt;For a fixed problem size, Amdahl’s Law specifies the maximum speedup you can obtain by parallelizing with more compute resources. Concretely, scaling compute resources only drives down the execution time of the parallel portion, so the performance improvement is bounded by the serial portion. To quantify it, the maximum performance improvement is:&lt;/p&gt; &lt;p&gt;Speedup = 1 / ((1 - S) + S / p)&lt;/p&gt; &lt;p&gt;where S is the proportion of execution time taken by the parallelizable work and p is the speedup of that parallelizable work. In an ideal world where the parallel portion is perfectly parallelized, the speedup p can be the number of processing units.&lt;/p&gt; &lt;p&gt;Strong and weak scaling describe the performance improvement of scaling compute resources for different problem setups. Strong scaling refers to scaling compute resources to solve a fixed-size problem, and Amdahl’s Law quantifies the speedup of strong scaling. On the other hand, weak scaling refers to scaling compute resources to solve larger problems in constant time. For example, processing a 4x larger image in the same time using 4x more compute resources. We recommend &lt;a class="external" href="https://acenet-arc.github.io/ACENET_Summer_School_General/05-performance/index.html" rel="nofollow"&gt;this blog post&lt;/a&gt; for more detailed explanations.&lt;/p&gt;  &lt;p&gt;Strong and weak scaling imply different performance improvements across problem sizes. Strong scaling offers speedup for all problem sizes, while weak scaling only guarantees performance improvement when we use more compute to solve a larger problem.&lt;/p&gt;  &lt;p&gt;Data movement is a sin because in terms of runtime and scaling, computation is cheap and data movement is expensive. Data movement is fundamentally slower because modern DRAM cells operate at tens of nanoseconds, while transistors switch at sub-nanosecond speed. Regarding scaling, while computation speed gains have slowed since the 2000s, &lt;a class="external" href="https://semianalysis.com/2024/09/03/the-memory-wall/" rel="nofollow"&gt;memory speed has improved even more slowly&lt;/a&gt;, creating the &lt;a class="external" href="https://en.wikipedia.org/wiki/Random-access_memory#Memory_wall" rel="nofollow"&gt;memory wall&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;In this section, we introduce the main Nvidia GPU architectures that use Tensor Cores, namely the Tesla V100 GPU, A100 Tensor Core GPU, H100 Tensor Core GPU, as well as the Blackwell GPU. We have also included a pre-Tensor Core section as a refresher for the CUDA programming model. We will briefly go over the major features and changes that are relevant to understanding the Tensor Core, and we defer the details to other sources, which we link in each subsection.&lt;/p&gt; &lt;p&gt;Parallel Thread Execution (PTX) is a virtual instruction set that abstracts over GPU generations. A PTX program describes a &lt;strong&gt;kernel function&lt;/strong&gt; that is executed with a large number of GPU threads, which are executed on the GPU’s hardware execution units, i.e. CUDA cores. 
&lt;strong&gt;Threads&lt;/strong&gt; are organized as a grid, and a &lt;strong&gt;grid&lt;/strong&gt; consists of cooperative thread arrays (&lt;strong&gt;CTA&lt;/strong&gt;s). PTX threads can access data from multiple state spaces, which are memory storage areas with different characteristics. Specifically, threads have per-thread &lt;strong&gt;registers&lt;/strong&gt;, threads within a CTA have &lt;strong&gt;shared memory&lt;/strong&gt;, and all threads can access &lt;strong&gt;global memory&lt;/strong&gt;. For more information, please read &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#programming-model" rel="nofollow"&gt;this section of the CUDA documentation&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;The GPU architecture is built around an array of streaming multiprocessors (&lt;strong&gt;SM&lt;/strong&gt;s). An SM consists of scalar processing cores, a multithreaded instruction unit, and an on-chip shared memory. An SM maps each thread to a scalar processing core (also known as a CUDA core), and the multithreaded instruction unit manages threads in groups of 32 parallel threads called &lt;strong&gt;warps&lt;/strong&gt;.&lt;/p&gt; &lt;p&gt;At instruction issue time, the instruction unit selects a warp and issues an instruction to the threads of the warp. This execution method is called single-instruction, multiple threads (&lt;strong&gt;SIMT&lt;/strong&gt;). Similar to single-instruction, multiple data (&lt;strong&gt;SIMD&lt;/strong&gt;), SIMT controls multiple processing elements with a single instruction, but unlike SIMD, SIMT specifies a single thread behavior instead of vector width. For more information, please read &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#ptx-machine-model" rel="nofollow"&gt;this section of the CUDA documentation&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Streaming Assembler (SASS) is the architecture-specific instruction set that PTX virtualizes over. See the &lt;a class="external" href="https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-reference" rel="nofollow"&gt;CUDA binary utilities documentation&lt;/a&gt; for more information. Unfortunately, SASS is not well documented due to NVIDIA hiding their architecture ISA details from their competitors.&lt;/p&gt; &lt;p&gt;As deep learning became more prominent, the industry noticed that ML workloads were in need of hardware acceleration. Early in 2015, Google deployed TPUv1 for accelerating their internal ML workloads, and in 2017, Nvidia introduced dedicated hardware for matrix math. Although GPUs consume a small amount of energy when issuing instructions (~30pJ) because of their simple hardware pipeline, simple floating point operations like &lt;code&gt; &lt;/code&gt;consume even less energy at only 1.5pJ. This creates a 20x overhead of power needed for instructions vs for the floating point operation itself. As a result, performing a lot of floating point operations for matrix multiplication is power inefficient. To amortize the instruction overhead, we need to use complex instructions that can perform more computation per instruction. To this end, Nvidia designed the &lt;strong&gt;half-precision matrix multiply and accumulate (&lt;code&gt;&lt;/code&gt;) instruction&lt;/strong&gt;, a specialized instruction that performs half-precision matrix multiplication. 
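&lt;/p&gt; &lt;p&gt;As a rough sanity check of those numbers, the short sketch below amortizes the roughly 30pJ instruction-issue cost over the FLOPs performed by a single instruction. The pJ constants are the figures quoted above; the 4x4x4 tile shape is just an illustrative example, not a claim about any specific instruction.&lt;/p&gt; &lt;pre&gt;&lt;code&gt;
# Back-of-the-envelope energy amortization, using the figures quoted above:
# roughly 30 pJ to issue an instruction and roughly 1.5 pJ per floating point op.
ISSUE_PJ = 30.0
FLOP_PJ = 1.5

def energy_per_flop(flops_per_instruction):
    """Average energy per FLOP including amortized instruction-issue overhead."""
    return FLOP_PJ + ISSUE_PJ / flops_per_instruction

# One FLOP per instruction: the 20x issue overhead dominates.
print(energy_per_flop(1))               # 31.5 pJ per FLOP

# A hypothetical matrix instruction covering a 4x4x4 tile performs
# 2 * 4 * 4 * 4 = 128 FLOPs, so the 30 pJ issue cost nearly vanishes per FLOP.
print(energy_per_flop(2 * 4 * 4 * 4))   # about 1.73 pJ per FLOP
&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;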
The corresponding dedicated hardware to execute this instruction is the Tensor Core, introduced in the Tesla V100 GPU of the Volta architecture in 2017. The Volta Tensor Core was added very late in the development of the Volta architecture, only a handful of months before tape-out, a testament to how fast Nvidia can pivot their architecture.&lt;/p&gt;  &lt;p&gt;Given matrices A, B, and C, the multiply and accumulate (MMA) instruction computes D = A * B + C:&lt;/p&gt; &lt;ul class="wp-block-list"&gt;
&lt;li&gt;A is an M by K matrix&lt;/li&gt; &lt;li&gt;B is a K by N matrix&lt;/li&gt; &lt;li&gt;C and D are M by N matrices&lt;/li&gt;
&lt;/ul&gt; &lt;p&gt;We denote the matrix shapes as &lt;code&gt;&lt;/code&gt; or MxNxK.&lt;/p&gt; &lt;p&gt;To perform the full computation, we first load matrices A, B, and C from shared memory to thread registers, so that each thread holds fragments of the matrices. Second, we execute the MMA instruction, which reads the matrices from thread registers, performs computation on Tensor Cores, and stores the result to thread registers. Finally, we store the results from thread registers back to shared memory. The full computation is collectively performed by multiple threads, meaning that every step requires a synchronization between the collaborating threads.&lt;/p&gt;  &lt;p&gt;An SM of a Tesla V100 GPU contains 8 Tensor Cores, grouped in partitions of two. Each Tensor Core is capable of computing an equivalent of 4x4x4 matrix multiplication per cycle, which amounts to 1024 FLOPs per cycle per SM.&lt;br/&gt;&lt;/p&gt;  &lt;p&gt;NVIDIA designed PTX instruction mma to target the lower level &lt;code&gt;&lt;/code&gt; instructions. On Volta architecture, an MMA instruction performs an 8x8x4 matrix multiplication, and a quadpair of 8 threads participate in the operation by collectively holding the input and output matrices. Here T0 refers to thread 0, [T0, T1, T2, T3] and [T16, T17, T18, T19] are threadgroups, and the 2 threadgroups form a quadpair.&lt;/p&gt;  &lt;p&gt;In terms of data types, Volta Tensor Cores support FP16 inputs with FP32 accumulation in correspondence with NVIDIA’s &lt;a class="external" href="https://arxiv.org/abs/1710.03740" rel="nofollow"&gt;mixed-precision training&lt;/a&gt; technique. This technique showed it is possible to train models at lower precision without losing model accuracy.&lt;/p&gt; &lt;p&gt;To fully understand the MMA layout, please refer to Citadel’s microbenchmarking paper, &lt;a class="external" href="https://arxiv.org/abs/1804.06826" rel="nofollow"&gt;Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking&lt;/a&gt;. To see the interleaved layout pattern for Volta Tensor Core MMAs, please read the slides &lt;a class="external" href="https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9593-cutensor-high-performance-tensor-operations-in-cuda-v2.pdf" rel="nofollow"&gt;Programming Tensor Cores: Native Tensor Cores with CUTLASS&lt;/a&gt;. Finally, for other information of the Volta architecture, please refer to the whitepaper &lt;a class="external" href="https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf" rel="nofollow"&gt;NVIDIA Tesla V100 GPU Architecture&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;Turing architecture includes the &lt;strong&gt;2nd generation Tensor Cores&lt;/strong&gt;, an enhanced version of Volta Tensor Cores, adding INT8 and INT4 precision support. Turing Tensor Cores support a new warp-level synchronous MMA, which we will discuss in the next section. Turing Tensor Cores also enabled Deep Learning Super Sampling (DLSS), marking the start of NVIDIA applying deep learning to gaming graphics. 
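&lt;/p&gt; &lt;p&gt;For readers who want the MMA semantics described above without the layout details, the following NumPy sketch expresses D = A * B + C for an MxNxK problem and re-derives the per-SM FLOP count quoted for Volta. It is a functional reference only; it says nothing about how threads, quadpairs, or Tensor Cores actually carry out the operation.&lt;/p&gt; &lt;pre&gt;&lt;code&gt;
import numpy as np

def mma_reference(A, B, C):
    """Reference semantics of an MMA: D = A * B + C (no Tensor Core layout details)."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2 and C.shape == (M, N)
    # FP16 inputs with FP32 accumulation, mirroring Volta's mixed-precision MMA.
    return A.astype(np.float32) @ B.astype(np.float32) + C.astype(np.float32)

M, N, K = 8, 8, 4   # the Volta-era PTX mma shape mentioned above
A = np.random.rand(M, K).astype(np.float16)
B = np.random.rand(K, N).astype(np.float16)
C = np.zeros((M, N), dtype=np.float32)
D = mma_reference(A, B, C)

# Sanity check of the per-SM figure quoted above: 8 Tensor Cores per Volta SM,
# each doing a 4x4x4 multiply-accumulate per cycle, i.e. 2*4*4*4 FLOPs per core.
flops_per_core_per_cycle = 2 * 4 * 4 * 4
print(8 * flops_per_core_per_cycle)     # 1024 FLOPs per cycle per SM
&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;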
Interested readers can refer to NVIDIA’s blog post &lt;a class="external" href="https://developer.nvidia.com/blog/nvidia-turing-architecture-in-depth/" rel="nofollow"&gt;NVIDIA Turing Architecture In-Depth&lt;/a&gt; and the &lt;a class="external" href="https://images.nvidia.com/aem-dam/en-zz/Solutions/design-visualization/technologies/turing-architecture/NVIDIA-Turing-Architecture-Whitepaper.pdf" rel="nofollow"&gt;Turing architecture whitepaper&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;With Ampere, NVIDIA introduced asynchronous data copy, a way of copying data directly from global memory to shared memory in an asynchronous fashion. To load data from global memory to shared memory on Volta, threads must first load data from global memory to registers, and then store it to shared memory. However, MMA instructions have high register usage and must share the register file with data-loading operations, causing high register pressure and wasting memory bandwidth for copying data in and out of RF.&lt;/p&gt; &lt;p&gt;Async data copy mitigates this issue by fetching data from global memory (DRAM) and directly storing it into shared memory (with optional L1 access), freeing up more registers for MMA instructions. Data loading and compute can happen asynchronously which is more difficult from a programming model perspective but unlocks higher performance.&lt;/p&gt; &lt;p&gt;This feature is implemented as PTX instruction thread-level async copy cp.async (&lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#data-movement-and-conversion-instructions-non-bulk-copy" rel="nofollow"&gt;documentation&lt;/a&gt;). The corresponding SASS is LDGSTS, asynchronous global to shared memory copy. The exact synchronization methods are async-group and mbarrier-based completion mechanisms, detailed &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#data-movement-and-conversion-instructions-asynchronous-copy-completion-mechanisms" rel="nofollow"&gt;here&lt;/a&gt;.&lt;/p&gt;  &lt;p&gt;Ampere has 4 Tensor Cores per SM, and each Tensor Core is capable of performing 512 FLOPs per cycle, amounting to 2048 Dense FLOPs per cycle per SM, doubling the performance of Volta.&lt;/p&gt; &lt;p&gt;While Volta requires a quadpair of 8 threads to participate in an MMA operation, Ampere requires a full warp of 32 threads. Having MMA instructions warp-wide simplifies the thread layout &amp;amp; reducing RF pressure for Ampere. For instance, here is the thread and data layout for mixed-precision floating point of shape 16x8x16:&lt;/p&gt;  &lt;p&gt;NVIDIA introduced &lt;code&gt;&lt;/code&gt; in Ampere, an enhanced vectorized load operation. Like &lt;code&gt;&lt;/code&gt;, &lt;code&gt;&lt;/code&gt; is warp-wide, meaning that a warp of threads collectively loads a matrix. Compared to issuing multiple load instructions, this reduces address generation register use, lowering register pressure. See &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-ldmatrix" rel="nofollow"&gt;the CUDA documentation&lt;/a&gt; for more information.&lt;/p&gt; &lt;p&gt;&lt;code&gt;&lt;/code&gt; loads data to registers in a layout that matches Tensor Core’s data layout. 
Compared to Volta’s interleaved pattern (See &lt;a class="external" href="https://developer.download.nvidia.com/video/gputechconf/gtc/2019/presentation/s9593-cutensor-high-performance-tensor-operations-in-cuda-v2.pdf" rel="nofollow"&gt;Programming Tensor Cores: Native Tensor Cores with CUTLASS&lt;/a&gt;), a simpler thread and data layout greatly improves the programming ergonomics. Watch the GTC talk &lt;a class="external" href="https://www.nvidia.com/en-us/on-demand/session/gtcsj20-s21745/" rel="nofollow"&gt;Developing CUDA Kernels to Push Tensor Cores to the Absolute Limit on NVIDIA A100&lt;/a&gt; to learn more about how exactly Ampere’s memory loading is coherent with Tensor Core.&lt;/p&gt; &lt;p&gt;Ampere MMA features Brain Floating Point Format (BF16), which has become the de facto standard for half-precision data types. BF16 provides the same 8-bit exponent range as FP32 but with a 7-bit mantissa, allowing FP32-level dynamic range at half the storage cost. BF16 also removes the need for loss scaling in mixed-precision training.&lt;/p&gt; &lt;p&gt;As the number of SMs grew, the size disparity between an SM and the whole GPU increased. To offer a finer granularity of control between CTAs (map to SMs) and the grid (maps to the whole GPU), on Hopper, NVIDIA added a new thread hierarchy level, &lt;strong&gt;thread block cluster&lt;/strong&gt;, which maps to a group of SMs physically located in the same graphics processing cluster (GPC). Thread block cluster is also called cooperative grid array (CGA) and referred to as cluster in the CUDA documentation (&lt;a class="external" href="https://stackoverflow.com/questions/78510678/whats-cga-in-cuda-programming-model" rel="nofollow"&gt;See here for more information&lt;/a&gt;).&lt;/p&gt; &lt;p&gt;CTAs in a thread block cluster are guaranteed to be co-scheduled on SMs in the same GPC and distributed one CTA per SM by default. The shared memory partitions of those SMs form a &lt;strong&gt;distributed shared memory (DSMEM)&lt;/strong&gt;. A thread can access the shared memory from another SM with low latency through the dedicated SM-to-SM network (without going through L2 cache). By exposing the GPC hardware execution unit to the programming model, programmers can reduce data movement and improve the data locality.&lt;/p&gt;  &lt;p&gt;To improve data fetch efficiency, NVIDIA added the Tensor Memory Accelerator (TMA) to each Hopper SM. TMA is a dedicated hardware unit that accelerates asynchronous data transfers of large quantities between global and shared memory (bulk asynchronous copies). &lt;/p&gt; &lt;p&gt;A single thread in a CTA can initiate a TMA copy operation. TMA frees up threads to execute other independent work, handling address generation and offering additional benefits such as out-of-bounds handling. In PTX, the corresponding instruction is &lt;code&gt;&lt;/code&gt;, detailed in &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#data-movement-and-conversion-instructions-bulk-copy" rel="nofollow"&gt;this CUDA documentation section&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;However, for small requests, TMA loads have higher latency than regular async data copies because of the address generation overhead. Thus, NVIDIA recommends programmers to use TMAs for large data copies to amortize the overhead. For example, in LLM inference, TMA is not suitable for workloads that load KV cache in small chunks, but works well when each chunk is a multiple of 16 bytes. 
For more concrete examples of this, see &lt;a class="external" href="https://lmsys.org/blog/2024-01-17-sglang/" rel="nofollow"&gt;SGLang prefix caching&lt;/a&gt;, paper &lt;a class="external" href="https://arxiv.org/abs/2501.01005" rel="nofollow"&gt;FlashInfer&lt;/a&gt; section 3.2.1, paper &lt;a class="external" href="https://arxiv.org/abs/2505.21487v1" rel="nofollow"&gt;Hardware-Efficient Attention for Fast Decoding&lt;/a&gt; section 4.2, and &lt;a class="external" href="https://github.com/HazyResearch/ThunderKittens/blob/mla/kernels/attn/demo/mla_decode/template_mla_decode.cu#L117" rel="nofollow"&gt;ThunderKittens MLA decode&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;TMA also supports a mode of loading data called multicast, where TMA loads data from global memory to shared memory of multiple SMs in a thread block cluster, specified by a multicast mask. Instead of issuing multiple global memory loads loading the same piece of data into multiple SMs, multicast completes it in one load. Specifically, multiple CTAs in a thread block cluster load a portion of the data into their corresponding SMEMs and share the data through DSMEM. This reduces L2 cache traffic and subsequently reduces HBM traffic. We recommend reading &lt;a class="external" href="https://research.colfax-intl.com/tutorial-hopper-tma/" rel="nofollow"&gt;Jay Shah’s TMA tutorial&lt;/a&gt; for more details.&lt;/p&gt;  &lt;p&gt;NVIDIA introduced a new type of MMA with Hopper, warpgroup-level MMA (&lt;code&gt;&lt;/code&gt;). &lt;code&gt;&lt;/code&gt; is warpgroup-wide, meaning that a warpgroup of 4 warps collectively performs an MMA operation. &lt;code&gt;&lt;/code&gt; supports a wider range of shapes. For example, mixed-precision MMA supports &lt;code&gt;&lt;/code&gt;, where N can be multiples of 8 from 8 to 256. &lt;code&gt;&lt;/code&gt; lowers to a new set of SASS: &lt;code&gt;&lt;/code&gt;. In another example, half-precision &lt;code&gt;&lt;/code&gt; instructions lowers to . See &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#asynchronous-warpgroup-level-matrix-shape" rel="nofollow"&gt;this CUDA documentation section&lt;/a&gt; for the details of MMA shapes and data types.&lt;/p&gt; &lt;p&gt;While all threads in a warpgroup collectively hold the output matrix in their registers, Hopper Tensor Cores can directly load operands from shared memory instead of registers, saving register space and bandwidth. Specifically, operand matrix A can reside in either registers or shared memory, while operand matrix B can only be accessed through shared memory. See the &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#asynchronous-warpgroup-level-matrix-instructions" rel="nofollow"&gt;CUDA documentation wgmma section&lt;/a&gt; for the details of ’s completion mechanism, SMEM layout, and more.&lt;/p&gt;  &lt;p&gt;For &lt;code&gt;&lt;/code&gt; data types, Hopper introduced 8-bit floating-point data types (E4M3 and E5M2) with FP32 accumulation. In practice,&lt;a class="external" href="https://arxiv.org/abs/2412.19437" rel="nofollow"&gt; the accumulation path was implemented as a 22-bit fixed-point format (13-bit mantissa plus sign and exponent bits),&lt;/a&gt; limiting the dynamic range compared to true 32-bit accumulation. Due to the reduced tensor core precision, every N_c accumulations has to happen in the CUDA core to prevent constraining training accuracy. 
(&lt;a class="external" href="https://arxiv.org/abs/2412.19437" rel="nofollow"&gt;See this paper section 3.3.2&lt;/a&gt;). This reduced precision accumulation improves efficiency, but comes at the cost of accuracy.&lt;/p&gt; &lt;p&gt;For more information on the Hopper Architecture, see the following:&lt;/p&gt; &lt;p&gt;For examples of how to program Hopper GPUs, see:&lt;/p&gt; &lt;p&gt;The extreme register pressure did not let up on Hopper, which motivated &lt;strong&gt;Tensor Memory (TMEM)&lt;/strong&gt;, a new piece of memory specialized for Tensor Core operations. On every SM, TMEM has 128 rows (lanes) and 512 columns of 4-byte cells, totaling to 256 KB, which is also the size of the register file on an SM.&lt;/p&gt; &lt;p&gt;TMEM has a restricted memory access pattern. Specifically, it takes a warpgroup to access the whole TMEM, and each warp in a warpgroup can only access a specific set of lanes. By limiting the memory access pattern, hardware designers can reduce the number of access ports, saving chip space. On the other hand, this design also means that epilogue operations need a warpgroup to operate. Unlike shared memory, programmers have to explicitly manage TMEM, including allocation, deallocation, and copying data in and out of TMEM.&lt;/p&gt;  &lt;p&gt;Two CTAs in a thread block cluster form a &lt;strong&gt;CTA pair&lt;/strong&gt; if their CTA ranks in their thread block cluster differ by the last bit, e.g. 0 and 1, 4 and 5. A CTA pair maps to a Texture Processing Cluster (TPC), which consists of two SMs and combines with other TPCs to form a GPC. When Blackwell Tensor Core operations perform at a CTA pair granularity, the two CTAs are able to share input operands. This sharing reduces both SMEM capacity and bandwidth requirements.&lt;/p&gt; &lt;p&gt;Tensor Core 5th Generation MMA instruction (&lt;code&gt;&lt;/code&gt; in PTX) fully moved away from using registers for holding matrices. Operands now reside in shared memory and Tensor Memory. &lt;/p&gt; &lt;p&gt;Specifically, suppose the MMA computes D = A * B + D: Not using thread registers removes the complex data layouts and frees up thread register space for other work such as epilogue operations. Unlike &lt;code&gt;&lt;/code&gt; using a warpgroup to initiate an MMA operation, &lt;code&gt;&lt;/code&gt; has single thread semantics, meaning that a single thread initiates an MMA operation. This removes the role of warps from issuing MMA.&lt;/p&gt;  &lt;p&gt;One notable MMA variant is MMA.2SM, which uses 2 SMs to collectively perform an MMA operation. MMA.2SM executes at the CTA-pair level granularity, and since &lt;code&gt;&lt;/code&gt; has single thread semantics, a single thread in the leader CTA of the CTA pair launches MMA.2SM. Here we illustrate data path organization &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#tcgen05-data-path-layout-a" rel="nofollow"&gt;layout A&lt;/a&gt;. Layout A shows MMA.2SM doubles the M dimension compared to the 1SM version (&lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#tcgen05-data-path-layout-d" rel="nofollow"&gt;layout D&lt;/a&gt;), so the two SMs load different matrix A and D tiles. In addition, MMA.2SM splits matrix B, halving the amount of data loaded.&lt;/p&gt;  &lt;p&gt;Matrix B is shared across the two SMs, meaning tiles B0 and B1 need to be communicated across the DSMEM. 
Although there is a bandwidth difference between DSMEM and SMEM, the effects on the coordination are minimal because we are loading smaller tiles. That said, we suspect that on Blackwell the communication bandwidth between SMs in a TPC is higher than DSMEM’s, so MMA.2SM leverages this to achieve better performance.&lt;/p&gt; &lt;p&gt;5th-gen Tensor Cores can also perform convolutions in addition to general matrix multiplication. &lt;code&gt;&lt;/code&gt; supports weight stationary patterns with a collector buffer, which caches matrix B for reuse. For more information, please refer to the &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tcgen05-mma" rel="nofollow"&gt;CUDA documentation&lt;/a&gt; and the corresponding &lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#tcgen05-mma-instructions-mma-ws" rel="nofollow"&gt;weight stationary MMA instruction&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;In terms of supported data types, Blackwell supports microscaling floating-point format (MXFP), including MXFP8, MXFP6, and MXFP4. See &lt;a class="external" href="https://arxiv.org/abs/2310.10537" rel="nofollow"&gt;this paper&lt;/a&gt; for details. Blackwell also supports NVIDIA’s own NVFP4 format, which is known for being more accurate than MXFP4. This is likely because of its smaller block size, different scaling factor data format, and the two-level quantization method (See &lt;a class="external" href="https://github.com/NVIDIA/TensorRT-LLM/issues/3037" rel="nofollow"&gt;this GitHub issue&lt;/a&gt;). See &lt;a class="external" href="https://arxiv.org/abs/2505.19115" rel="nofollow"&gt;this paper&lt;/a&gt; for data format comparisons.&lt;/p&gt; &lt;p&gt;With Blackwell, since FP8 and FP6 have the same theoretical throughput, we believe that they share physical circuits in Tensor Cores. In contrast, CDNA4 has 2x the FP6 throughput compared to FP8 because their FP6 units share data paths with FP4 instead. We believe that UDNA will switch to having FP6 units share with FP8 instead.&lt;/p&gt; &lt;p&gt;Ampere featured 2:4 structured sparsity, which in theory doubled the Tensor Core throughput. It achieves this by pruning the weight matrix such that for every 4 elements, 2 of them are zero. In this format, the matrix is compressed by removing zero elements, and an additional metadata index matrix records their positions, roughly halving the memory usage and bandwidth.&lt;/p&gt; &lt;p&gt;According to &lt;a class="external" href="https://arxiv.org/abs/2501.12084" rel="nofollow"&gt;this microbenchmarking paper from cracked Chinese engineers&lt;/a&gt;, Ampere’s structured sparsity can realize a 2x speedup for large-shape MMA operations at the instruction level. It also shows that in Hopper, structured sparsity &lt;code&gt;&lt;/code&gt; instructions can reach a 2x speedup and save up to 2x on memory bandwidth used to load weights.&lt;/p&gt; &lt;p&gt;Unfortunately, 2:4 structured sparsity GEMM kernels are unable to reach anywhere close to a 2x speedup compared to their dense counterparts on Hopper. This is due to difficulties in doing structured pruning while maintaining model accuracy, cuSPARSELt kernels being unoptimized, and TDP limitations. 
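&lt;/p&gt; &lt;p&gt;To make the 2:4 pattern concrete, the small NumPy sketch below keeps the two largest-magnitude values in every group of four and records their positions, mirroring the compressed-values-plus-metadata idea described above. It is purely illustrative and does not reproduce cuSPARSELt’s actual storage format.&lt;/p&gt; &lt;pre&gt;&lt;code&gt;
import numpy as np

def prune_2_of_4(w):
    """Keep the 2 largest-magnitude values in every group of 4 (2:4 structured sparsity)."""
    w = w.reshape(-1, 4)
    # Indices of the two largest-magnitude entries in each group of four.
    keep = np.argsort(np.abs(w), axis=1)[:, 2:]
    mask = np.zeros_like(w, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=1)
    pruned = np.where(mask, w, 0.0)
    # Compressed representation: the kept values plus a small metadata index matrix.
    values = np.take_along_axis(w, np.sort(keep, axis=1), axis=1)   # half the original values
    metadata = np.sort(keep, axis=1)                                # positions of kept values
    return pruned, values, metadata

w = np.random.randn(4, 8)
pruned, values, metadata = prune_2_of_4(w)
print(pruned.reshape(4, 8))
print(values.shape, metadata.shape)   # (8, 2) each: half the values plus index metadata
&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;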
Except for Chinese AI labs and a limited number of experimental western &lt;a class="external" href="https://arxiv.org/abs/2503.16672" rel="nofollow"&gt;research&lt;/a&gt; &lt;a class="external" href="https://developers.redhat.com/articles/2024/12/18/24-sparse-llama-fp8-sota-performance-nvidia-hopper-gpus" rel="nofollow"&gt;papers&lt;/a&gt;, most AI labs ignore 2:4 structured sparsity for production inferencing and focus on quantization &amp;amp; distillation. Meta is experimenting with it in Llama, but that is a dead-end path in many cases as well.&lt;/p&gt; &lt;p&gt;Furthermore, there is a lack of closed or open models that have shown performance improvements with 2:4 FP8 structured sparsity or 4:8 FP4 structured sparsity while maintaining zero accuracy loss, as well as a &lt;a class="external" href="https://github.com/NVIDIA/TensorRT-Model-Optimizer/blame/main/modelopt/torch/sparsity/sparsegpt.py" rel="nofollow"&gt;general lack of resources dedicated&lt;/a&gt; to structured pruning. We recommend that NVIDIA stop with &lt;a class="external" href="https://semianalysis.com/2025/03/19/nvidia-gtc-2025-built-for-reasoning-vera-rubin-kyber-cpo-dynamo-inference-jensen-math-feynman/#jensen-math-changes-every-year" rel="nofollow"&gt;Jensen math&lt;/a&gt; structured sparsity flops in keynotes &amp;amp; marketing material unless they start consistently showing SOTA open models being able to take advantage of structured pruning for inferencing. A good first step would be to do structured sparsity on DeepSeek and also show that performance can stack on top of other techniques like distillation and quantization formats such as NVFP4.&lt;/p&gt;  &lt;p&gt;In its fifth-generation Tensor Cores, NVIDIA introduced pair-wise 4:8 structured sparsity for the NVFP4 data type. In this scheme, every eight elements are grouped into four consecutive pairs, and exactly two of those pairs must contain non-zero values while the remaining two are pruned to zero. Because NVFP4 is a sub-byte data type, we believe this constraint motivated NVIDIA to adopt the pair-wise 4:8 pattern. Although 4:8 sparsity may appear more permissive than the earlier 2:4 pattern, the added pair-wise requirement means it is not, in practice, a more relaxed constraint for ML engineers seeking to preserve model accuracy while pruning.&lt;/p&gt;   &lt;p&gt;Over generations, NVIDIA scaled the Tensor Core size more aggressively than the number of Tensor Cores. NVIDIA chose to scale the Tensor Core size rather than the number of cores because it suits the performance characteristics of matrix multiplication better. Specifically, when scaling the problem size, matrix multiplication computation grows cubically, but data movement grows quadratically, meaning the arithmetic intensity grows linearly. O(n) arithmetic intensity, combined with the fact that data movement is more expensive than computation, incentivized the Tensor Core size increase.&lt;/p&gt;  &lt;p&gt;However, both scaling the core size and the number of cores come at the cost of quantization effects. Specifically, having a large number of cores suffers from the &lt;a class="external" href="https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#tile-quant" rel="nofollow"&gt;tile quantization effect&lt;/a&gt;, and having a large core size leads to the &lt;a class="external" href="https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html#wave-quant" rel="nofollow"&gt;wave quantization effect&lt;/a&gt;. 
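&lt;/p&gt; &lt;p&gt;These quantization effects are easy to see with a few lines of arithmetic. In the sketch below, the same output matrix is cut into small or large tiles on a hypothetical 132-SM GPU (all numbers are illustrative); the larger work unit leaves its single wave mostly empty.&lt;/p&gt; &lt;pre&gt;&lt;code&gt;
import math

def wave_utilization(m, n, tile_m, tile_n, num_sms):
    """Utilization when an m-by-n output is tiled and each SM runs one tile per wave."""
    tiles = math.ceil(m / tile_m) * math.ceil(n / tile_n)
    waves = math.ceil(tiles / num_sms)
    return tiles / (waves * num_sms)

# Same 2048x2048 output on a hypothetical 132-SM GPU, two different tile sizes.
print(wave_utilization(2048, 2048, 128, 128, 132))  # 256 tiles, 2 waves: about 0.97
print(wave_utilization(2048, 2048, 256, 256, 132))  # 64 tiles, 1 wave: about 0.48
&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;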
The wave quantization effect occurs when the number of work units isn’t fully divisible by the number of workers, causing utilization to drop when processing the final, smaller batch of work. Increasing tensor core size is essentially increasing the work unit size, resulting in low utilization for small matrices (See this &lt;a class="external" href="https://hazyresearch.stanford.edu/blog/2025-03-15-tk-blackwell" rel="nofollow"&gt;ThunderKittens blog post&lt;/a&gt;).&lt;/p&gt;  &lt;p&gt;The linear growth in arithmetic intensity also motivates the increase in MMA shape. Having larger MMA shapes enhances the operand sharing granularity. Specifically, launching fewer larger tiles would increase the data reuse, saving memory footprint and bandwidth of RF and SMEM. For architectures before Blackwell, this led to increasing the number of threads to collectively perform an MMA operation, from a quadpair of 8 threads (Volta), to a warp of 32 threads (Ampere), and then a warpgroup of 128 threads (Hopper).&lt;/p&gt;  &lt;p&gt;Shared memory increased almost every generation, while register file size stayed constant. The reason for this is that Tensor Core throughput increase requires a deeper staging buffer.&lt;/p&gt; &lt;p&gt;Because Tensor Cores consume data much faster than global memory can load, we use a staging memory to buffer data, so memory loading can run ahead of MMA operations. &lt;strong&gt;Tensor Core throughput doubled every generation, but global memory load latency didn’t decrease and in fact increased. As a result, we need to increase the staging memory size for buffering more data.&lt;/strong&gt; To implement this, NVIDIA chose shared memory as the staging memory for Tensor Cores, which explains why shared memory increased but register file size remained constant.&lt;/p&gt; &lt;p&gt;However, Blackwell’s shared memory size didn’t increase from Hopper. This is because tcgen05 MMA can leverage 2 SMs, so each SM’s shared memory only needs to load half of the operands. Thus, Blackwell’s shared memory size effectively doubled.&lt;/p&gt; &lt;p&gt;NVIDIA’s staging memory choice also explains why operand locations gradually moved away from registers to shared memory. That said, NVIDIA added TMEM on Blackwell to support the increased Tensor Core throughput. Since TMEM is placed closer to Tensor Cores, it can be more power efficient. In addition, having a separate memory increases the aggregate memory bandwidth for saturating the Tensor Cores.&lt;/p&gt; &lt;p&gt;Among all operands, matrix D always stays in TMEM. We can take advantage of TMEM’s power efficiency with this design because matrix D is more frequently accessed than matrix A and B. For example, to compute a tile in a naive tiled matrix multiplication, matrix D tile is accessed 2Kt times (Kt reads and Kt writes. Kt: The number of tiles along the K dimension), whereas matrix A tiles and matrix B tiles are accessed only once.&lt;/p&gt;   &lt;p&gt;The “H” in &lt;code&gt;&lt;/code&gt; stands for half precision since it is a 16 bit format while “Q” in &lt;code&gt;&lt;/code&gt; stands for quarter precision (8 bit) since 8 bits is a quarter of a full precision (32 bits). “O” stands for “Octal” which means one eighth of 32 bits as &lt;code&gt;&lt;/code&gt; is FP4.&lt;/p&gt; &lt;p&gt;MMA instructions seemingly jumped from synchronous to asynchronous. 
In reality, MMA instructions gradually became asynchronous at the SASS level because of the need to overlap  instructions.&lt;/p&gt; &lt;p&gt;At SASS level, an MMA operation involves executing one &lt;code&gt;&lt;/code&gt; instruction to load matrix tiles from shared memory to the register file, and then two  instructions to perform MMA. During execution, the two  instructions are issued asynchronously, and block the register usage with hardware interlocks. Since hardware interlocks disallows overlapping LDSM instructions, sequential execution of one &lt;code&gt;&lt;/code&gt; and two  instructions creates a small bubble in the instruction issue pipeline. However, Tensor Cores have become so fast that this bubble causes non-negligible amount of performance loss, which calls for an asynchronous completion mechanism for MMA.&lt;/p&gt; &lt;p&gt;Hopper supports asynchronous completion mechanism commit and fence for &lt;code&gt;&lt;/code&gt;. When &lt;code&gt;&lt;/code&gt; instructions are issued, there are no hardware interlocks to guard register usage. Instead, the compiler schedules &lt;code&gt;&lt;/code&gt; for the next MMA and uses  instruction to keep the next &lt;code&gt;&lt;/code&gt; waiting. With Blackwell, the MMA operation is fully asynchronous. Instructions for loading into Tensor Memory (&lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#tcgen05-memory-consistency-model-async-operations" rel="nofollow"&gt;tcgen05.ld / &lt;/a&gt;&lt;a class="external" href="http://tcgen05.st" rel="nofollow"&gt;tcgen05.st&lt;/a&gt;&lt;a class="external" href="https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=tcgen05%2520cp#tcgen05-memory-consistency-model-async-operations" rel="nofollow"&gt; / tcgen05.cp&lt;/a&gt;) are all explicitly asynchronous.&lt;/p&gt;   &lt;p&gt;Throughout each successive generation of NVIDIA Tensor Cores, NVIDIA continues to add lower precision data types, starting from 16-bit to 4-bits. This is because deep learning workloads are extremely tolerant of low precision. This is especially true for inference, where even lower precision can be used than during training. Low precision is more power efficient, takes up less silicon floor space and achieves higher compute throughput. In newer generations, we also see NVIDIA removing FP64 support to prioritize low precision data types under silicon area and power budgets.&lt;/p&gt; &lt;p&gt;Interestingly, the prioritization also affected integer data type support. Since Hopper, INT4 data types are deprecated, and on Blackwell Ultra, we see lower INT8 compute throughput. This is caused by the delayed popularity of low-precision integer data types. Although Turing supported INT8 and INT4, it wasn’t until 4 years later that new inference quantization methods were able to exploit the compactness of INT4 for serving LLMs. 
By that time, NVIDIA had already deprecated INT4 on Hopper &lt;code&gt;&lt;/code&gt;.&lt;/p&gt; &lt;p&gt;Next, we will talk about how the programming model evolved, including the transition from high-occupancy to single-occupancy, the increase in explicit asynchronous execution, and how those designs relate to NVIDIA betting on strong scaling.&lt;/p&gt; &lt;div class="wp-block-passport-restricted-content"&gt; &lt;/div&gt; &lt;p&gt;If readers like to learn the basics of CUDA programming model, hardware, and concepts, &lt;a class="external" href="https://modal.com/gpu-glossary" rel="nofollow"&gt;GPU Glossary by Modal&lt;/a&gt; is a great resource for everything before Blackwell. To understand the big ideas of CUDA, we recommend all of Stephen Jones’ GTC talks (&lt;a class="external" href="https://www.nvidia.com/en-us/on-demand/search/?facet.mimetype[]=event%20session&amp;amp;layout=list&amp;amp;page=1&amp;amp;q=%22Stephen%20Jones%20%28SW%29%22&amp;amp;sort=relevance&amp;amp;sortDir=desc" rel="nofollow"&gt;playlist here&lt;/a&gt;). To get a deeper understanding of the memory features, GTC talk &lt;a class="external" href="https://www.nvidia.com/en-us/on-demand/session/gtc25-s72683/" rel="nofollow"&gt;CUDA Techniques to Maximize Memory Bandwidth and Hide Latency&lt;/a&gt; explains the memory features of Volta, Ampere, and Hopper, and &lt;a class="external" href="https://www.nvidia.com/en-us/on-demand/session/gtc24-s62192/" rel="nofollow"&gt;Advanced Performance Optimization in CUDA&lt;/a&gt; dives deep into memory models. Finally, for Blackwell-specific resources, we recommend GTC talk &lt;a class="external" href="https://www.nvidia.com/en-us/on-demand/session/gtc25-s72720/" rel="nofollow"&gt;Programming Blackwell Tensor Cores with CUTLASS&lt;/a&gt;, Colfax research CUTLASS articles (&lt;a class="external" href="https://research.colfax-intl.com/cutlass-tutorial-writing-gemm-kernels-using-tensor-memory-for-nvidia-blackwell-gpus/" rel="nofollow"&gt;latest one here&lt;/a&gt;), and the CUTLASS kernel examples.&lt;/p&gt;
&lt;/div&gt;</summary></entry><entry><title>Anthropic: How we built our multi-agent research system</title><link href="https://simonwillison.net/2025/Jun/14/multi-agent-research-system/#atom-everything" rel="alternate"></link><published>2025-06-24T13:15:12.159000Z</published><id>https://simonwillison.net/2025/Jun/14/multi-agent-research-system/#atom-everything</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/anthropic-how-we-bui/790:237301"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/790.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Simon Willison&amp;#x27;s Weblog.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div&gt;&lt;div class="entry entryPage"&gt; &lt;p&gt;&lt;strong&gt;&lt;a class="external" href="https://www.anthropic.com/engineering/built-multi-agent-research-system" rel="nofollow"&gt;Anthropic: How we built our multi-agent research system&lt;/a&gt;&lt;/strong&gt;. OK, I'm sold on multi-agent LLM systems now.&lt;/p&gt;
&lt;p&gt;I've been pretty skeptical of these until recently: why make your life more complicated by running multiple different prompts in parallel when you can usually get something useful done with a single, carefully-crafted prompt against a frontier model?&lt;/p&gt;
&lt;p&gt;This detailed description from Anthropic about how they engineered their "Claude Research" tool has cured me of that skepticism.&lt;/p&gt;
&lt;p&gt;&lt;a class="external" href="https://simonwillison.net/2025/Jun/2/claude-trace/" rel="nofollow"&gt;Reverse engineering Claude Code&lt;/a&gt; had already shown me a mechanism where certain coding research tasks were passed off to a "sub-agent" using a tool call. This new article describes a more sophisticated approach.&lt;/p&gt;
&lt;p&gt;They start strong by providing a clear definition of how they'll be using the term "agent" - it's the "tools in a loop" variant:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A multi-agent system consists of multiple agents (LLMs autonomously using tools in a loop) working together. Our Research feature involves an agent that plans a research process based on user queries, and then uses tools to create parallel agents that search for information simultaneously.&lt;/p&gt;
&lt;/blockquote&gt;
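&lt;p&gt;To make that definition concrete, here's a minimal sketch of the "tools in a loop" pattern. The &lt;code&gt;fake_llm&lt;/code&gt; stand-in and the tool set are mine, invented for illustration - this is not Anthropic's implementation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
# Minimal "tools in a loop" sketch of the pattern described above. The fake_llm
# below is a canned stand-in for a real model call; Anthropic's actual system is
# far more involved than this.
import json

TOOLS = {
    "web_search": lambda query: f"(pretend search results for {query!r})",
}

def fake_llm(messages):
    """Stand-in model: asks for one search, then answers. A real agent calls an LLM here."""
    if any(m["role"] == "tool" for m in messages):
        return {"content": "Here is my answer, based on the tool results.", "tool_call": None}
    return {"content": "", "tool_call": {"name": "web_search",
                                         "arguments": {"query": messages[0]["content"]}}}

def run_agent(task, model=fake_llm, max_steps=10):
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = model(messages)
        if reply["tool_call"] is None:          # the model chose to stop: loop ends
            return reply["content"]
        tool = reply["tool_call"]
        result = TOOLS[tool["name"]](**tool["arguments"])
        messages.append({"role": "assistant", "content": json.dumps(reply)})
        messages.append({"role": "tool", "content": result})
    return "step budget exhausted"

print(run_agent("identify the board members of major IT companies"))
&lt;/code&gt;&lt;/pre&gt;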
&lt;p&gt;Why use multiple agents for a research system?&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The essence of search is compression: distilling insights from a vast corpus. Subagents facilitate compression by operating in parallel with their own context windows, exploring different aspects of the question simultaneously before condensing the most important tokens for the lead research agent. [...]&lt;/p&gt;
&lt;p&gt;Our internal evaluations show that multi-agent research systems excel especially for breadth-first queries that involve pursuing multiple independent directions simultaneously. We found that a multi-agent system with Claude Opus 4 as the lead agent and Claude Sonnet 4 subagents outperformed single-agent Claude Opus 4 by 90.2% on our internal research eval. For example, when asked to identify all the board members of the companies in the Information Technology S&amp;amp;P 500, the multi-agent system found the correct answers by decomposing this into tasks for subagents, while the single agent system failed to find the answer with slow, sequential searches.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;As anyone who has spent time with Claude Code will already have noticed, the downside of this architecture is that it can burn &lt;em&gt;a lot&lt;/em&gt; more tokens:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;There is a downside: in practice, these architectures burn through tokens fast. In our data, agents typically use about 4× more tokens than chat interactions, and multi-agent systems use about 15× more tokens than chats. For economic viability, multi-agent systems require tasks where the value of the task is high enough to pay for the increased performance. [...]&lt;/p&gt;
&lt;p&gt;We’ve found that multi-agent systems excel at valuable tasks that involve heavy parallelization, information that exceeds single context windows, and interfacing with numerous complex tools.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The key benefit is all about managing that 200,000 token context limit. Each sub-task has its own separate context, allowing much larger volumes of content to be processed as part of the research task.&lt;/p&gt;
&lt;p&gt;Providing a "memory" mechanism is important as well:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The LeadResearcher begins by thinking through the approach and saving its plan to Memory to persist the context, since if the context window exceeds 200,000 tokens it will be truncated and it is important to retain the plan.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The rest of the article provides a detailed description of the prompt engineering process needed to build a truly effective system:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Early agents made errors like spawning 50 subagents for simple queries, scouring the web endlessly for nonexistent sources, and distracting each other with excessive updates. Since each agent is steered by a prompt, prompt engineering was our primary lever for improving these behaviors. [...]&lt;/p&gt;
&lt;p&gt;In our system, the lead agent decomposes queries into subtasks and describes them to subagents. Each subagent needs an objective, an output format, guidance on the tools and sources to use, and clear task boundaries.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;They got good results from having special agents help optimize those crucial tool descriptions:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We even created a tool-testing agent—when given a flawed MCP tool, it attempts to use the tool and then rewrites the tool description to avoid failures. By testing the tool dozens of times, this agent found key nuances and bugs. This process for improving tool ergonomics resulted in a 40% decrease in task completion time for future agents using the new description, because they were able to avoid most mistakes.&lt;/p&gt;
&lt;/blockquote&gt;
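&lt;p&gt;To make that concrete, here's a rough sketch of what such a tool-repair loop could look like (this is my own illustrative Python, not Anthropic's code; &lt;code&gt;call_llm&lt;/code&gt; and &lt;code&gt;flawed_tool&lt;/code&gt; are stand-in stubs):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Illustrative sketch of a tool-testing agent: exercise a flawed tool,
# collect the failures, then ask a model to rewrite the tool description.
# call_llm and flawed_tool are placeholders, not real APIs.

def call_llm(prompt: str) -&gt; str:
    # Placeholder: in practice this would be a real LLM SDK call.
    return "Improved tool description based on observed failures."

def flawed_tool(query: str) -&gt; str:
    # Placeholder MCP tool that fails on empty queries.
    if not query:
        raise ValueError("query must not be empty")
    return f"results for {query}"

def improve_tool_description(description: str, test_queries: list[str], rounds: int = 3) -&gt; str:
    for _ in range(rounds):
        failures = []
        for q in test_queries:
            try:
                flawed_tool(q)
            except Exception as exc:
                failures.append(f"input {q!r} failed: {exc}")
        if not failures:
            break
        description = call_llm(
            "Rewrite this tool description so agents avoid these failures.\n"
            f"Description: {description}\nFailures:\n" + "\n".join(failures)
        )
    return description

print(improve_tool_description("Searches the web.", ["", "multi-agent research systems"]))&lt;/code&gt;&lt;/pre&gt;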
&lt;p&gt;Sub-agents can run in parallel which provides significant performance boosts:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;For speed, we introduced two kinds of parallelization: (1) the lead agent spins up 3-5 subagents in parallel rather than serially; (2) the subagents use 3+ tools in parallel. These changes cut research time by up to 90% for complex queries, allowing Research to do more work in minutes instead of hours while covering more information than other systems.&lt;/p&gt;
&lt;/blockquote&gt;
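&lt;p&gt;The orchestration pattern itself is simple to sketch. Here's a minimal, hypothetical Python version (again my illustration, not Anthropic's code; &lt;code&gt;call_llm&lt;/code&gt; is a stub standing in for a real model-plus-tools call): the lead agent fans subtasks out, each subagent works in its own separate context, and only condensed findings flow back to the lead for synthesis.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import asyncio

async def call_llm(system: str, user: str) -&gt; str:
    # Placeholder for a real async LLM call with tool use; returns a short summary.
    await asyncio.sleep(0)
    return f"[condensed findings for: {user}]"

async def run_subagent(subtask: str) -&gt; str:
    # Each subagent gets its own, separate context window.
    system = "You are a research subagent. Search, read, and return only the key findings."
    return await call_llm(system, subtask)

async def lead_researcher(query: str) -&gt; str:
    # 1) Plan: decompose the query into independent subtasks (hard-coded here).
    subtasks = [
        f"{query}: find primary sources",
        f"{query}: find recent statistics",
        f"{query}: find dissenting views",
    ]
    # 2) Fan out: run the subagents in parallel rather than serially.
    findings = await asyncio.gather(*(run_subagent(t) for t in subtasks))
    # 3) Compress: the lead agent synthesizes the condensed findings.
    return await call_llm("You are the lead researcher. Synthesize the findings.",
                          "\n".join(findings))

print(asyncio.run(lead_researcher("an example breadth-first research question")))&lt;/code&gt;&lt;/pre&gt;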
&lt;p&gt;There's also an extensive section about their approach to evals - they found that LLM-as-a-judge worked well for them, but human evaluation was essential as well:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We often hear that AI developer teams delay creating evals because they believe that only large evals with hundreds of test cases are useful. However, it’s best to start with small-scale testing right away with a few examples, rather than delaying until you can build more thorough evals. [...]&lt;/p&gt;
&lt;p&gt;In our case, human testers noticed that our early agents consistently chose SEO-optimized content farms over authoritative but less highly-ranked sources like academic PDFs or personal blogs. Adding source quality heuristics to our prompts helped resolve this issue.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;There's so much useful, actionable advice in this piece. I haven't seen anything else about multi-agent system design that's anywhere near this practical.&lt;/p&gt;
&lt;p&gt;They even added &lt;a class="external" href="https://github.com/anthropics/anthropic-cookbook/tree/main/patterns/agents/prompts" rel="nofollow"&gt;some example prompts&lt;/a&gt; from their Research system to their open source prompting cookbook. Here's &lt;a class="external" href="https://github.com/anthropics/anthropic-cookbook/blob/46f21f95981e3633d7b1eac235351de4842cf9f0/patterns/agents/prompts/research_lead_agent.md?plain=1#L135-L137" rel="nofollow"&gt;the bit&lt;/a&gt; that encourages parallel tool use:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;&amp;lt;use_parallel_tool_calls&amp;gt; For maximum efficiency, whenever you need to perform multiple independent operations, invoke all relevant tools simultaneously rather than sequentially. Call tools in parallel to run subagents at the same time. You MUST use parallel tool calls for creating multiple subagents (typically running 3 subagents at the same time) at the start of the research, unless it is a straightforward query. For all other queries, do any necessary quick initial planning or investigation yourself, then run multiple subagents in parallel. Leave any extensive tool calls to the subagents; instead, focus on running subagents in parallel efficiently. &amp;lt;/use_parallel_tool_calls&amp;gt;&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And an interesting description of &lt;a class="external" href="https://github.com/anthropics/anthropic-cookbook/blob/46f21f95981e3633d7b1eac235351de4842cf9f0/patterns/agents/prompts/research_subagent.md?plain=1#L10" rel="nofollow"&gt;the OODA research loop&lt;/a&gt; used by the sub-agents: &lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;code&gt;Research loop: Execute an excellent OODA (observe, orient, decide, act) loop by (a) observing what information has been gathered so far, what still needs to be gathered to accomplish the task, and what tools are available currently; (b) orienting toward what tools and queries would be best to gather the needed information and updating beliefs based on what has been learned so far; (c) making an informed, well-reasoned decision to use a specific tool in a certain way; (d) acting to use this tool. Repeat this loop in an efficient way to research well and learn based on new results.&lt;/code&gt;&lt;/p&gt;
&lt;/blockquote&gt; &lt;/div&gt;&lt;/div&gt;</summary></entry><entry><title>Tips on prompting ChatGPT for UK technology secretary Peter Kyle</title><link href="https://simonwillison.net/2025/Jun/3/tips-for-peter-kyle/#atom-everything" rel="alternate"></link><published>2025-06-06T15:00:53.536000Z</published><id>https://simonwillison.net/2025/Jun/3/tips-for-peter-kyle/#atom-everything</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/tips-on-prompting-ch/790:675308"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/790.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Simon Willison&amp;#x27;s Weblog.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div&gt; &lt;p class="mobile-date"&gt;3rd June 2025&lt;/p&gt; &lt;p&gt;Back in March &lt;a class="external" href="https://www.newscientist.com/article/2472068-revealed-how-the-uk-tech-secretary-uses-chatgpt-for-policy-advice/" rel="nofollow"&gt;New Scientist reported on&lt;/a&gt; a successful Freedom of Information request they had filed requesting UK Secretary of State for Science, Innovation and Technology &lt;a class="external" href="https://en.wikipedia.org/wiki/Peter_Kyle" rel="nofollow"&gt;Peter Kyle’s&lt;/a&gt; ChatGPT logs:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;New Scientist has obtained records of Kyle’s ChatGPT use under the Freedom of Information (FOI) Act, in what is believed to be a world-first test of whether chatbot interactions are subject to such laws.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;What a fascinating precedent this could set!&lt;/p&gt;
&lt;p&gt;They picked out some highlights they thought were particularly newsworthy. Personally I’d have loved to see that raw data to accompany the story.&lt;/p&gt; &lt;p&gt;Among the questions Kyle asked of ChatGPT was this one:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Why is AI adoption so slow in the UK small and medium business community?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;(I pinged the New Scientist reporter, Chris Stokel-Walker, to confirm the exact wording here.)&lt;/p&gt;
&lt;p&gt;This provides an irresistible example of the “jagged frontier” of LLMs in action. LLMs are great at some things, terrible at others and the difference between the two is often not obvious at all.&lt;/p&gt;
&lt;p&gt;Experienced prompters will no doubt have the same reaction I did: that’s not going to give an accurate response! It’s worth digging into why those of us with a firmly developed sense of intuition around LLMs would jump straight to that conclusion.&lt;/p&gt;
&lt;p&gt;The problem with this question is that it assumes a level of omniscience that even the very best LLMs do not possess.&lt;/p&gt;
&lt;p&gt;At the very best, I would expect this prompt to spit out the approximate average of what had been published on that subject in time to be hoovered up by the training data for the GPT-4o training cutoff &lt;a class="external" href="https://platform.openai.com/docs/models/gpt-4o" rel="nofollow"&gt;of September 2023&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;(Here’s &lt;a class="external" href="https://chatgpt.com/share/683f3f94-d51c-8006-aea9-7567d08e2f68" rel="nofollow"&gt;what I got just now&lt;/a&gt; running it against GPT-4o.)&lt;/p&gt;
&lt;p&gt;This illustrates the first lesson of effective LLM usage: &lt;strong&gt;know your training cutoff dates&lt;/strong&gt;. For many queries these are an essential factor in whether or not the LLM is likely to provide you with a useful answer.&lt;/p&gt;
&lt;p&gt;Given the pace of change in the AI landscape, an answer based on September 2023 training data is unlikely to offer useful insights into the state of things in 2025.&lt;/p&gt;
&lt;p&gt;It’s worth noting that there &lt;em&gt;are&lt;/em&gt; tools that might do better at this. OpenAI’s Deep Research tool for example can run a barrage of searches against the web for recent information, then spend multiple minutes digesting those results, running follow-up searches and crunching that together into an impressive looking report.&lt;/p&gt;
&lt;p&gt;(I still wouldn’t trust it for a question this broad though: the report format looks more credible than it is, and can suffer from &lt;a class="external" href="https://simonwillison.net/2025/Feb/25/deep-research-system-card/" rel="nofollow"&gt;misinformation by omission&lt;/a&gt; which is very difficult to spot.)&lt;/p&gt;
&lt;p&gt;Deep Research only rolled out in February this year, so it is unlikely to be the tool Peter Kyle was using given likely delays in receiving the requested FOIA data.&lt;/p&gt;
&lt;h4&gt;What I would do instead&lt;/h4&gt;
&lt;p&gt;Off the top of my head, here are examples of prompts I would use if I wanted to get ChatGPT’s help digging into this particular question:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Brainstorm potential reasons that UK SMBs might be slow to embrace recent advances in AI&lt;/strong&gt;. This would give me a starting point for my own thoughts about the subject, and may highlight some things I hadn’t considered that I should look into further.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Identify key stakeholders in the UK SMB community who might have insights on this issue&lt;/strong&gt;. I wouldn’t expect anything comprehensive here, but it might turn up some initial names I could reach out to for interviews or further research.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;I work in UK Government: which departments should I contact that might have relevant information on this topic&lt;/strong&gt;? Given the size and complexity of the UK government, even cabinet ministers could be excused for not knowing every department.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Suggest other approaches I could take to research this issue&lt;/strong&gt;. Another brainstorming prompt. I like prompts like this where “right or wrong” doesn’t particularly matter. LLMs are electric bicycles for the mind.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use your search tool: find recent credible studies on the subject and identify their authors&lt;/strong&gt;. I’ve been getting some good results from telling LLMs with good search tools—&lt;a class="external" href="https://simonwillison.net/2025/Apr/21/ai-assisted-search/#o3-and-o4-mini-are-really-good-at-search" rel="nofollow"&gt;like o3 and o4-mini&lt;/a&gt;—to evaluate the “credibility” of sources they find. It’s a dumb prompting hack but it appears to work quite well—you can watch their reasoning traces and see how they place more faith in papers from well known publications, or newspapers with strong reputations for fact checking.&lt;/li&gt;
&lt;/ul&gt;
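&lt;p&gt;For what it's worth, any of those brainstorming prompts is trivial to run programmatically too. A minimal sketch, assuming the official OpenAI Python SDK and an &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; in the environment:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "Brainstorm potential reasons that UK SMBs might be slow "
                   "to embrace recent advances in AI",
    }],
)
print(response.choices[0].message.content)&lt;/code&gt;&lt;/pre&gt;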
&lt;h4&gt;Prompts that do make sense&lt;/h4&gt;
&lt;p&gt;From the New Scientist article:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;As well as seeking this advice, Kyle asked ChatGPT to define various terms relevant to his department: antimatter, quantum and digital inclusion. Two experts &lt;em&gt;New Scientist&lt;/em&gt; spoke to said they were surprised by the quality of the responses when it came to ChatGPT’s definitions of quantum. “This is surprisingly good, in my opinion,” says &lt;a class="external" href="https://profiles.imperial.ac.uk/p.knight" rel="nofollow"&gt;Peter Knight&lt;/a&gt; at Imperial College London. “I think it’s not bad at all,” says &lt;a class="external" href="https://researchportal.hw.ac.uk/en/persons/cristian-bonato" rel="nofollow"&gt;Cristian Bonato&lt;/a&gt; at Heriot-Watt University in Edinburgh, UK.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This doesn’t surprise me at all. If you ask a good LLM for definitions of terms with strong, well established meanings you’re going to get great results almost every time.&lt;/p&gt;
&lt;p&gt;My rule of thumb used to be that if a friend who had just read the Wikipedia page on a subject could answer my question, then an LLM would be able to answer it too.&lt;/p&gt;
&lt;p&gt;As the frontier models have grown stronger I’ve upgraded that rule of thumb. I now expect a good result for any mainstream-enough topic for which there was widespread consensus prior to that all-important training cutoff date.&lt;/p&gt;
&lt;p&gt;Once again, it all comes down to intuition. The only way to get really strong intuition as to what will work with LLMs is to spend a huge amount of time using them, and paying a skeptical eye to everything that they produce.&lt;/p&gt;
&lt;p&gt;Treating ChatGPT as an all-knowing oracle for anything outside of a two-year-stale Wikipedia version of the world’s knowledge is almost always a mistake.&lt;/p&gt;
&lt;p&gt;Treating it as a brainstorming companion and electric bicycle for the mind is, I think, a much better strategy.&lt;/p&gt;
&lt;h4&gt;Should the UK technology secretary be using ChatGPT?&lt;/h4&gt;
&lt;p&gt;Some of the reporting I’ve seen around this story has seemed to suggest that Peter Kyle’s use of ChatGPT is embarrassing.&lt;/p&gt;
&lt;p&gt;Personally, I think that if the UK’s Secretary of State for Science, Innovation and Technology was &lt;em&gt;not&lt;/em&gt; exploring this family of technologies it would be a dereliction of duty!&lt;/p&gt;
&lt;p&gt;The thing we can’t tell from these ChatGPT logs is how dependent he was on these results.&lt;/p&gt;
&lt;p&gt;Did he idly throw some questions at ChatGPT out of curiosity to see what came back, then ignore that entirely, engage with his policy team and talk to experts in the field to get a detailed understanding of the issues at hand?&lt;/p&gt;
&lt;p&gt;Or did he prompt ChatGPT, take the results as gospel and make policy decisions based on that sloppy interpretation of a two-year stale guess at the state of the world?&lt;/p&gt;
&lt;p&gt;Those are the questions I’d like to see answered.&lt;/p&gt; &lt;/div&gt;</summary></entry><entry><title>Introduction#</title><link href="https://module-federation.io/guide/start/index.html" rel="alternate"></link><published>2025-04-03T10:10:54.009000Z</published><id>https://module-federation.io/guide/start/index.html</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/introduction/0:6439d5"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="rspress-doc"&gt; &lt;p class="my-4 leading-7"&gt;Module Federation is an architectural pattern for the decentralization of JavaScript applications (similar to microservices on the server-side). It allows you to share code and resources among multiple JavaScript applications (or micro-frontends). This can help you:&lt;/p&gt;
&lt;ul class="list-disc pl-5 my-4 leading-7"&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;Reduce code duplication&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;Improve code maintainability&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;Lower the overall size of your applications&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;Enhance the performance of your applications&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="mt-10 mb-2 leading-7 text-xl title_3b154"&gt;✨ What is Module Federation 2.0?&lt;a class="link_3b154 header-anchor" href="http://#-what-is-module-federation-20" rel="nofollow"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p class="my-4 leading-7"&gt;&lt;code&gt;Module Federation 2.0&lt;/code&gt; differs from the &lt;code&gt;Module Federation&lt;/code&gt; built into &lt;code&gt;Webpack5&lt;/code&gt; by providing not only the core features of module export, loading, and dependency sharing but also additional dynamic type hinting, &lt;code&gt;Manifest&lt;/code&gt;, &lt;code&gt;Federation Runtime&lt;/code&gt;, and &lt;code&gt;Runtime Plugin System&lt;/code&gt;. These features make &lt;code&gt;Module Federation&lt;/code&gt; more suitable for use as a micro-frontend architecture in large-scale &lt;code&gt;Web&lt;/code&gt; applications.&lt;/p&gt;
&lt;h3 class="mt-10 mb-2 leading-7 text-xl title_3b154"&gt;🔥 Features&lt;a class="link_3b154 header-anchor" href="http://#-features" rel="nofollow"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p class="my-4 leading-7"&gt;Module Federation has the following features:&lt;/p&gt; &lt;h3 class="mt-10 mb-2 leading-7 text-xl title_3b154"&gt;🎯 Use Cases&lt;a class="link_3b154 header-anchor" href="http://#-use-cases" rel="nofollow"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p class="my-4 leading-7"&gt;Module Federation is suitable for the following scenarios:&lt;/p&gt;
&lt;ul class="list-disc pl-5 my-4 leading-7"&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;&lt;strong class="font-semibold"&gt;Large Applications&lt;/strong&gt;: For large applications, you can break the application into multiple micro-frontends and use Module Federation to share code and resources between them.&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;&lt;strong class="font-semibold"&gt;Microfrontend Architecture&lt;/strong&gt;: Module Federation is an ideal tool for building microfrontend architectures.&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;&lt;strong class="font-semibold"&gt;Multi-team Development&lt;/strong&gt;: Module Federation can assist multiple teams in collaboratively developing large applications.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 class="mt-10 mb-2 leading-7 text-xl title_3b154"&gt;🕠 History of Module Federation&lt;a class="link_3b154 header-anchor" href="http://#-history-of-module-federation" rel="nofollow"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p class="my-4 leading-7"&gt;Module Federation is a new feature introduced in Webpack 5, but its history dates back to 2017. At that time, the Webpack team began exploring a way to share code between multiple applications.&lt;/p&gt;
&lt;ul class="list-disc pl-5 my-4 leading-7"&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;
&lt;p class="my-4 leading-7"&gt;In 2018, Webpack 4.20 was released, introducing module hooks, which laid the foundation for the development of Module Federation.&lt;/p&gt;
&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;
&lt;p class="my-4 leading-7"&gt;In 2019, Webpack 5 was released, officially introducing the Module Federation feature.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p class="my-4 leading-7"&gt;Module Federation has become a powerful tool for building modern web applications.&lt;/p&gt;
&lt;h3 class="mt-10 mb-2 leading-7 text-xl title_3b154"&gt;🕰️ The Future of Module Federation&lt;a class="link_3b154 header-anchor" href="http://#️-the-future-of-module-federation" rel="nofollow"&gt;#&lt;/a&gt;&lt;/h3&gt;
&lt;p class="my-4 leading-7"&gt;Module Federation aims to become an architectural method for building large web applications, similar to microservices in the backend. Module Federation will provide more capabilities to meet the foundational needs of large web application decentralization, currently including these parts:&lt;/p&gt;
&lt;ul class="list-disc pl-5 my-4 leading-7"&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;Providing comprehensive Devtool tools&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;Offering more high-level framework capabilities like Router, Sandbox, SSR&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;Providing best practices for large web applications based on Module Federation&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="mt-12 mb-6 pt-8 text-2xl tracking-tight border-t-[1px] border-divider-light title_3b154"&gt;Follow Us&lt;a class="link_3b154 header-anchor" href="http://#follow-us" rel="nofollow"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;ul class="list-disc pl-5 my-4 leading-7"&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;&lt;a class="link_03735 link_3b154 inline-link_3b154" href="https://github.com/module-federation/core" rel="nofollow"&gt;GitHub - Star us on GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;&lt;a class="link_03735 link_3b154 inline-link_3b154" href="https://discord.com/channels/1055442562959290389/1055442563718467637" rel="nofollow"&gt;Discord&lt;/a&gt;&lt;/li&gt;
&lt;li class="[&amp;amp;:not(:first-child)]:mt-2"&gt;&lt;a class="link_03735 link_3b154 inline-link_3b154" href="https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=a41s8f79-741f-41ba-8349-395d9a0e9662" rel="nofollow"&gt;Lark Group (Chinese Community)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 class="mt-12 mb-6 pt-8 text-2xl tracking-tight border-t-[1px] border-divider-light title_3b154"&gt;✨ Next Steps&lt;a class="link_3b154 header-anchor" href="http://#-next-steps" rel="nofollow"&gt;#&lt;/a&gt;&lt;/h2&gt;
&lt;p class="my-4 leading-7"&gt;You might want to:&lt;/p&gt; &lt;/div&gt;</summary></entry><entry><title>GitHub - PriorLabs/TabPFN</title><link href="https://github.com/PriorLabs/TabPFN" rel="alternate"></link><published>2025-04-03T09:45:56.916000Z</published><id>https://github.com/PriorLabs/TabPFN</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/github-priorlabstabp/0:0c7b68"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div&gt;&lt;div class="application-main"&gt; &lt;p&gt;Official installation (pip)&lt;/p&gt; &lt;p&gt;OR installation from source&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto"&gt;&lt;pre&gt;pip install &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;tabpfn @ git+https://github.com/PriorLabs/TabPFN.git&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;OR local development installation&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto"&gt;&lt;pre&gt;git clone &amp;lt;a href="https://github.com/PriorLabs/TabPFN.git" rel="nofollow"&amp;gt;https://github.com/PriorLabs/TabPFN.git&amp;lt;/a&amp;gt;
pip install -e &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;TabPFN[dev]&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt; &lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;sklearn&lt;/span&gt;.&lt;span class="pl-s1"&gt;datasets&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;load_breast_cancer&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;sklearn&lt;/span&gt;.&lt;span class="pl-s1"&gt;metrics&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;accuracy_score&lt;/span&gt;, &lt;span class="pl-s1"&gt;roc_auc_score&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;sklearn&lt;/span&gt;.&lt;span class="pl-s1"&gt;model_selection&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;train_test_split&lt;/span&gt; &lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;tabpfn&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;TabPFNClassifier&lt;/span&gt; &lt;span class="pl-c"&gt;# Load data&lt;/span&gt;
&lt;span class="pl-c1"&gt;X&lt;/span&gt;, &lt;span class="pl-s1"&gt;y&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;load_breast_cancer&lt;/span&gt;(&lt;span class="pl-s1"&gt;return_X_y&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;)
&lt;span class="pl-v"&gt;X_train&lt;/span&gt;, &lt;span class="pl-v"&gt;X_test&lt;/span&gt;, &lt;span class="pl-s1"&gt;y_train&lt;/span&gt;, &lt;span class="pl-s1"&gt;y_test&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;train_test_split&lt;/span&gt;(&lt;span class="pl-c1"&gt;X&lt;/span&gt;, &lt;span class="pl-s1"&gt;y&lt;/span&gt;, &lt;span class="pl-s1"&gt;test_size&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;0.5&lt;/span&gt;, &lt;span class="pl-s1"&gt;random_state&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;42&lt;/span&gt;) &lt;span class="pl-c"&gt;# Initialize a classifier&lt;/span&gt;
&lt;span class="pl-s1"&gt;clf&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;TabPFNClassifier&lt;/span&gt;()
&lt;span class="pl-s1"&gt;clf&lt;/span&gt;.&lt;span class="pl-c1"&gt;fit&lt;/span&gt;(&lt;span class="pl-v"&gt;X_train&lt;/span&gt;, &lt;span class="pl-s1"&gt;y_train&lt;/span&gt;) &lt;span class="pl-c"&gt;# Predict probabilities&lt;/span&gt;
&lt;span class="pl-s1"&gt;prediction_probabilities&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;clf&lt;/span&gt;.&lt;span class="pl-c1"&gt;predict_proba&lt;/span&gt;(&lt;span class="pl-v"&gt;X_test&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s"&gt;"ROC AUC:"&lt;/span&gt;, &lt;span class="pl-en"&gt;roc_auc_score&lt;/span&gt;(&lt;span class="pl-s1"&gt;y_test&lt;/span&gt;, &lt;span class="pl-s1"&gt;prediction_probabilities&lt;/span&gt;[:, &lt;span class="pl-c1"&gt;1&lt;/span&gt;])) &lt;span class="pl-c"&gt;# Predict labels&lt;/span&gt;
&lt;span class="pl-s1"&gt;predictions&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;clf&lt;/span&gt;.&lt;span class="pl-c1"&gt;predict&lt;/span&gt;(&lt;span class="pl-v"&gt;X_test&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s"&gt;"Accuracy"&lt;/span&gt;, &lt;span class="pl-en"&gt;accuracy_score&lt;/span&gt;(&lt;span class="pl-s1"&gt;y_test&lt;/span&gt;, &lt;span class="pl-s1"&gt;predictions&lt;/span&gt;))&lt;/pre&gt;&lt;/div&gt; &lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;sklearn&lt;/span&gt;.&lt;span class="pl-s1"&gt;datasets&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;fetch_openml&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;sklearn&lt;/span&gt;.&lt;span class="pl-s1"&gt;metrics&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;mean_squared_error&lt;/span&gt;, &lt;span class="pl-s1"&gt;r2_score&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;sklearn&lt;/span&gt;.&lt;span class="pl-s1"&gt;model_selection&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-s1"&gt;train_test_split&lt;/span&gt; &lt;span class="pl-c"&gt;# Assuming there is a TabPFNRegressor (if not, a different regressor should be used)&lt;/span&gt;
&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;tabpfn&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;TabPFNRegressor&lt;/span&gt; &lt;span class="pl-c"&gt;# Load Boston Housing data&lt;/span&gt;
&lt;span class="pl-s1"&gt;df&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;fetch_openml&lt;/span&gt;(&lt;span class="pl-s1"&gt;data_id&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;531&lt;/span&gt;, &lt;span class="pl-s1"&gt;as_frame&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;True&lt;/span&gt;) &lt;span class="pl-c"&gt;# Boston Housing dataset&lt;/span&gt;
&lt;span class="pl-c1"&gt;X&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;df&lt;/span&gt;.&lt;span class="pl-c1"&gt;data&lt;/span&gt;
&lt;span class="pl-s1"&gt;y&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;df&lt;/span&gt;.&lt;span class="pl-c1"&gt;target&lt;/span&gt;.&lt;span class="pl-c1"&gt;astype&lt;/span&gt;(&lt;span class="pl-s1"&gt;float&lt;/span&gt;) &lt;span class="pl-c"&gt;# Ensure target is float for regression&lt;/span&gt; &lt;span class="pl-c"&gt;# Train-test split&lt;/span&gt;
&lt;span class="pl-v"&gt;X_train&lt;/span&gt;, &lt;span class="pl-v"&gt;X_test&lt;/span&gt;, &lt;span class="pl-s1"&gt;y_train&lt;/span&gt;, &lt;span class="pl-s1"&gt;y_test&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;train_test_split&lt;/span&gt;(&lt;span class="pl-c1"&gt;X&lt;/span&gt;, &lt;span class="pl-s1"&gt;y&lt;/span&gt;, &lt;span class="pl-s1"&gt;test_size&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;0.5&lt;/span&gt;, &lt;span class="pl-s1"&gt;random_state&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;42&lt;/span&gt;) &lt;span class="pl-c"&gt;# Initialize the regressor&lt;/span&gt;
&lt;span class="pl-s1"&gt;regressor&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;TabPFNRegressor&lt;/span&gt;() &lt;span class="pl-s1"&gt;regressor&lt;/span&gt;.&lt;span class="pl-c1"&gt;fit&lt;/span&gt;(&lt;span class="pl-v"&gt;X_train&lt;/span&gt;, &lt;span class="pl-s1"&gt;y_train&lt;/span&gt;) &lt;span class="pl-c"&gt;# Predict on the test set&lt;/span&gt;
&lt;span class="pl-s1"&gt;predictions&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;regressor&lt;/span&gt;.&lt;span class="pl-c1"&gt;predict&lt;/span&gt;(&lt;span class="pl-v"&gt;X_test&lt;/span&gt;) &lt;span class="pl-c"&gt;# Evaluate the model&lt;/span&gt;
&lt;span class="pl-s1"&gt;mse&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;mean_squared_error&lt;/span&gt;(&lt;span class="pl-s1"&gt;y_test&lt;/span&gt;, &lt;span class="pl-s1"&gt;predictions&lt;/span&gt;)
&lt;span class="pl-s1"&gt;r2&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;r2_score&lt;/span&gt;(&lt;span class="pl-s1"&gt;y_test&lt;/span&gt;, &lt;span class="pl-s1"&gt;predictions&lt;/span&gt;) &lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s"&gt;"Mean Squared Error (MSE):"&lt;/span&gt;, &lt;span class="pl-s1"&gt;mse&lt;/span&gt;)
&lt;span class="pl-en"&gt;print&lt;/span&gt;(&lt;span class="pl-s"&gt;"R² Score:"&lt;/span&gt;, &lt;span class="pl-s1"&gt;r2&lt;/span&gt;)&lt;/pre&gt;&lt;/div&gt; &lt;p&gt;For optimal performance, use the &lt;code&gt;AutoTabPFNClassifier&lt;/code&gt; or &lt;code&gt;AutoTabPFNRegressor&lt;/code&gt; for post-hoc ensembling. These can be found in the &lt;a class="external" href="https://github.com/PriorLabs/tabpfn-extensions" rel="nofollow"&gt;TabPFN Extensions&lt;/a&gt; repository. Post-hoc ensembling combines multiple TabPFN models into an ensemble.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Steps for Best Results:&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Install the extensions:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto"&gt;&lt;pre&gt;git clone &amp;lt;a href="https://github.com/priorlabs/tabpfn-extensions.git" rel="nofollow"&amp;gt;https://github.com/priorlabs/tabpfn-extensions.git&amp;lt;/a&amp;gt;
pip install -e tabpfn-extensions&lt;/pre&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;from&lt;/span&gt; &lt;span class="pl-s1"&gt;tabpfn_extensions&lt;/span&gt;.&lt;span class="pl-s1"&gt;post_hoc_ensembles&lt;/span&gt;.&lt;span class="pl-s1"&gt;sklearn_interface&lt;/span&gt; &lt;span class="pl-k"&gt;import&lt;/span&gt; &lt;span class="pl-v"&gt;AutoTabPFNClassifier&lt;/span&gt; &lt;span class="pl-s1"&gt;clf&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-en"&gt;AutoTabPFNClassifier&lt;/span&gt;(&lt;span class="pl-s1"&gt;max_time&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-c1"&gt;120&lt;/span&gt;, &lt;span class="pl-s1"&gt;device&lt;/span&gt;&lt;span class="pl-c1"&gt;=&lt;/span&gt;&lt;span class="pl-s"&gt;"cuda"&lt;/span&gt;) &lt;span class="pl-c"&gt;# 120 seconds tuning time&lt;/span&gt;
&lt;span class="pl-s1"&gt;clf&lt;/span&gt;.&lt;span class="pl-c1"&gt;fit&lt;/span&gt;(&lt;span class="pl-v"&gt;X_train&lt;/span&gt;, &lt;span class="pl-s1"&gt;y_train&lt;/span&gt;)
&lt;span class="pl-s1"&gt;predictions&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-s1"&gt;clf&lt;/span&gt;.&lt;span class="pl-c1"&gt;predict&lt;/span&gt;(&lt;span class="pl-v"&gt;X_test&lt;/span&gt;)&lt;/pre&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt; &lt;p&gt;Choose the right TabPFN implementation for your needs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a class="external" href="https://github.com/priorlabs/tabpfn-client" rel="nofollow"&gt;TabPFN Client&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;
Simple API client for using TabPFN via cloud-based inference.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a class="external" href="https://github.com/priorlabs/tabpfn-extensions" rel="nofollow"&gt;TabPFN Extensions&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;
A powerful companion repository packed with advanced utilities, integrations, and features - great place to contribute:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;🔍 &lt;strong&gt;&lt;code&gt;interpretability&lt;/code&gt;&lt;/strong&gt;: Gain insights with SHAP-based explanations, feature importance, and selection tools.&lt;/li&gt;
&lt;li&gt;🕵️‍♂️ &lt;strong&gt;&lt;code&gt;unsupervised&lt;/code&gt;&lt;/strong&gt;: Tools for outlier detection and synthetic tabular data generation.&lt;/li&gt;
&lt;li&gt;🧬 &lt;strong&gt;&lt;code&gt;embeddings&lt;/code&gt;&lt;/strong&gt;: Extract and use TabPFN’s internal learned embeddings for downstream tasks or analysis.&lt;/li&gt;
&lt;li&gt;🧠 &lt;strong&gt;&lt;code&gt;many_class&lt;/code&gt;&lt;/strong&gt;: Handle multi-class classification problems that exceed TabPFN's built-in class limit.&lt;/li&gt;
&lt;li&gt;🌲 &lt;strong&gt;&lt;code&gt;rf_pfn&lt;/code&gt;&lt;/strong&gt;: Combine TabPFN with traditional models like Random Forests for hybrid approaches.&lt;/li&gt;
&lt;li&gt;⚙️ &lt;strong&gt;&lt;code&gt;hpo&lt;/code&gt;&lt;/strong&gt;: Automated hyperparameter optimization tailored to TabPFN.&lt;/li&gt;
&lt;li&gt;🔁 &lt;strong&gt;&lt;code&gt;post_hoc_ensembles&lt;/code&gt;&lt;/strong&gt;: Boost performance by ensembling multiple TabPFN models post-training.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;✨ To install:&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto"&gt;&lt;pre&gt;git clone &amp;lt;a href="https://github.com/priorlabs/tabpfn-extensions.git" rel="nofollow"&amp;gt;https://github.com/priorlabs/tabpfn-extensions.git&amp;lt;/a&amp;gt;
pip install -e tabpfn-extensions&lt;/pre&gt;&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a class="external" href="https://github.com/priorlabs/tabpfn" rel="nofollow"&gt;TabPFN (this repo)&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;
Core implementation for fast and local inference with PyTorch and CUDA support.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;a class="external" href="https://ux.priorlabs.ai" rel="nofollow"&gt;TabPFN UX&lt;/a&gt;&lt;/strong&gt;&lt;br/&gt;
No-code graphical interface to explore TabPFN capabilities—ideal for business users and prototyping.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt; &lt;p&gt;Prior Labs License (Apache 2.0 with additional attribution requirement): &lt;a class="external" href="https://priorlabs.ai/tabpfn-license/" rel="nofollow"&gt;here&lt;/a&gt;&lt;/p&gt; &lt;p&gt;We're building the future of tabular machine learning and would love your involvement:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Connect &amp;amp; Learn&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Join our &lt;a class="external" href="https://discord.gg/VJRuU3bSxt" rel="nofollow"&gt;Discord Community&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Read our &lt;a class="external" href="https://priorlabs.ai/docs" rel="nofollow"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Check out &lt;a class="external" href="https://github.com/priorlabs/tabpfn/issues" rel="nofollow"&gt;GitHub Issues&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Contribute&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Report bugs or request features&lt;/li&gt;
&lt;li&gt;Submit pull requests&lt;/li&gt;
&lt;li&gt;Share your research and use cases&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Stay Updated&lt;/strong&gt;: Star the repo and join Discord for the latest updates&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt; &lt;p&gt;You can read our paper explaining TabPFN &lt;a class="external" href="https://doi.org/10.1038/s41586-024-08328-6" rel="nofollow"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;div class="highlight highlight-text-bibtex notranslate position-relative overflow-auto"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;@article&lt;/span&gt;{&lt;span class="pl-en"&gt;hollmann2025tabpfn&lt;/span&gt;, &lt;span class="pl-s"&gt;title&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;Accurate predictions on small data with a tabular foundation model&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;author&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;Hollmann, Noah and M{\"u}ller, Samuel and Purucker, Lennart and&lt;/span&gt;
&lt;span class="pl-s"&gt; Krishnakumar, Arjun and K{\"o}rfer, Max and Hoo, Shi Bin and&lt;/span&gt;
&lt;span class="pl-s"&gt; Schirrmeister, Robin Tibor and Hutter, Frank&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;journal&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;Nature&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;year&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;2025&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;month&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;01&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;day&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;09&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;doi&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;10.1038/s41586-024-08328-6&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;publisher&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;Springer Nature&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;url&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;&amp;lt;a href="https://www.nature.com/articles/s41586-024-08328-6" rel="nofollow"&amp;gt;https://www.nature.com/articles/s41586-024-08328-6&amp;lt;/a&amp;gt;&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;,
} &lt;span class="pl-k"&gt;@inproceedings&lt;/span&gt;{&lt;span class="pl-en"&gt;hollmann2023tabpfn&lt;/span&gt;, &lt;span class="pl-s"&gt;title&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;TabPFN: A transformer that solves small tabular classification problems in a second&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;author&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;Hollmann, Noah and M{\"u}ller, Samuel and Eggensperger, Katharina and Hutter, Frank&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;booktitle&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;International Conference on Learning Representations 2023&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;, &lt;span class="pl-s"&gt;year&lt;/span&gt;=&lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;{&lt;/span&gt;2023&lt;span class="pl-pds"&gt;}&lt;/span&gt;&lt;/span&gt;
}&lt;/pre&gt;&lt;/div&gt; &lt;p&gt;&lt;strong&gt;Q: What dataset sizes work best with TabPFN?&lt;/strong&gt;&lt;br/&gt;
A: TabPFN is optimized for &lt;strong&gt;datasets up to 10,000 rows&lt;/strong&gt;. For larger datasets, consider using &lt;strong&gt;Random Forest preprocessing&lt;/strong&gt; or other extensions. See our &lt;a class="external" href="https://colab.research.google.com/drive/154SoIzNW1LHBWyrxNwmBqtFAr1uZRZ6a#scrollTo=OwaXfEIWlhC8" rel="nofollow"&gt;Colab notebook&lt;/a&gt; for strategies.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: Why can't I use TabPFN with Python 3.8?&lt;/strong&gt;&lt;br/&gt;
A: TabPFN v2 requires &lt;strong&gt;Python 3.9+&lt;/strong&gt; due to newer language features. Compatible versions: &lt;strong&gt;3.9, 3.10, 3.11, 3.12, 3.13&lt;/strong&gt;.&lt;/p&gt; &lt;p&gt;&lt;strong&gt;Q: How do I use TabPFN without an internet connection?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;TabPFN automatically downloads model weights when first used. For offline usage:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Using the Provided Download Script&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If you have the TabPFN repository, you can use the included script to download all models (including ensemble variants):&lt;/p&gt;
&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto"&gt;&lt;pre&gt;&lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; After installing TabPFN&lt;/span&gt;
python scripts/download_all_models.py&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This script will download the main classifier and regressor models, as well as all ensemble variant models to your system's default cache directory.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Manual Download&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Download the model files manually from HuggingFace:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Classifier: &lt;a class="external" href="https://huggingface.co/Prior-Labs/TabPFN-v2-clf/resolve/main/tabpfn-v2-classifier.ckpt" rel="nofollow"&gt;tabpfn-v2-classifier.ckpt&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Regressor: &lt;a class="external" href="https://huggingface.co/Prior-Labs/TabPFN-v2-reg/resolve/main/tabpfn-v2-regressor.ckpt" rel="nofollow"&gt;tabpfn-v2-regressor.ckpt&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Place the file in one of these locations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Specify directly: &lt;code&gt;TabPFNClassifier(model_path="/path/to/model.ckpt")&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Set environment variable: &lt;code&gt;os.environ["TABPFN_MODEL_CACHE_DIR"] = "/path/to/dir"&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Default OS cache directory:
&lt;ul&gt;
&lt;li&gt;Windows: &lt;code&gt;%APPDATA%\tabpfn\&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;macOS: &lt;code&gt;~/Library/Caches/tabpfn/&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Linux: &lt;code&gt;~/.cache/tabpfn/&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
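&lt;p&gt;Putting the offline options together, a minimal sketch (the paths below are placeholders for wherever you saved the checkpoints):&lt;/p&gt;
&lt;div class="highlight highlight-source-python notranslate position-relative overflow-auto"&gt;&lt;pre&gt;import os

# Option A: point TabPFN at a directory of manually downloaded .ckpt files
os.environ["TABPFN_MODEL_CACHE_DIR"] = "/opt/models/tabpfn"

from tabpfn import TabPFNClassifier

# Option B: pass an explicit checkpoint path instead
clf = TabPFNClassifier(model_path="/opt/models/tabpfn/tabpfn-v2-classifier.ckpt")&lt;/pre&gt;&lt;/div&gt;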
&lt;p&gt;&lt;strong&gt;Q: I'm getting a &lt;code&gt;pickle&lt;/code&gt; error when loading the model. What should I do?&lt;/strong&gt;&lt;br/&gt;
A: Try the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Download the newest version of tabpfn &lt;code&gt;pip install tabpfn --upgrade&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Ensure model files downloaded correctly (re-download if needed)&lt;/li&gt;
&lt;/ul&gt; &lt;p&gt;&lt;strong&gt;Q: Can TabPFN handle missing values?&lt;/strong&gt;&lt;br/&gt;
A: &lt;strong&gt;Yes!&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Q: How can I improve TabPFN’s performance?&lt;/strong&gt;&lt;br/&gt;
A: Best practices:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;AutoTabPFNClassifier&lt;/strong&gt; from &lt;a class="external" href="https://github.com/priorlabs/tabpfn-extensions" rel="nofollow"&gt;TabPFN Extensions&lt;/a&gt; for post-hoc ensembling&lt;/li&gt;
&lt;li&gt;Feature engineering: Add domain-specific features to improve model performance&lt;br/&gt;
Not effective:
&lt;ul&gt;
&lt;li&gt;Adapt feature scaling&lt;/li&gt;
&lt;li&gt;Convert categorical features to numerical values (e.g., one-hot encoding)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt; &lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto"&gt;&lt;pre&gt;python -m venv venv
&lt;span class="pl-c1"&gt;source&lt;/span&gt; venv/bin/activate &lt;span class="pl-c"&gt;&lt;span class="pl-c"&gt;#&lt;/span&gt; On Windows: venv\Scripts\activate&lt;/span&gt;
git clone &amp;lt;a href="https://github.com/PriorLabs/TabPFN.git" rel="nofollow"&amp;gt;https://github.com/PriorLabs/TabPFN.git&amp;lt;/a&amp;gt;
&lt;span class="pl-c1"&gt;cd&lt;/span&gt; tabpfn
pip install -e &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;.[dev]&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
pre-commit install&lt;/pre&gt;&lt;/div&gt; &lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto"&gt;&lt;pre&gt;pre-commit run --all-files&lt;/pre&gt;&lt;/div&gt; &lt;p&gt;Built with ❤️ by &lt;a class="external" href="https://priorlabs.ai" rel="nofollow"&gt;Prior Labs&lt;/a&gt; - Copyright (c) 2025 Prior Labs GmbH&lt;/p&gt;  &lt;/div&gt;&lt;p class="ajax-error-message"&gt;  You can’t perform that action at this time. &lt;/p&gt;&lt;/div&gt;</summary></entry><entry><title>Minimal CSS-only blurry image placeholders</title><link href="https://leanrada.com/notes/css-only-lqip/" rel="alternate"></link><published>2025-04-03T08:15:53.910000Z</published><id>https://leanrada.com/notes/css-only-lqip/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/minimal-css-only-blu/0:72f949"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

</summary></entry><entry><title>qdm12/gluetun: VPN client in a thin Docker container for multiple VPN providers, written in Go, and using OpenVPN or Wireguard, DNS over TLS, with a few proxy servers built-in.</title><link href="https://github.com/qdm12/gluetun" rel="alternate"></link><published>2025-03-14T16:20:59.061000Z</published><id>https://github.com/qdm12/gluetun</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/qdm12gluetun-vpn-cli/0:56cbcc"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div&gt;&lt;div class="application-main"&gt; &lt;/div&gt;&lt;p class="ajax-error-message"&gt;  You can’t perform that action at this time. &lt;/p&gt;&lt;/div&gt;</summary></entry><entry><title>How MIG maximizes GPU efficiency on OpenShift AI | Red Hat Developer</title><link href="https://developers.redhat.com/articles/2025/02/06/how-mig-maximizes-gpu-efficiency-openshift-ai#the_nvidia_mig_solution_and_test" rel="alternate"></link><published>2025-02-07T09:01:08.243000Z</published><id>https://developers.redhat.com/articles/2025/02/06/how-mig-maximizes-gpu-efficiency-openshift-ai#the_nvidia_mig_solution_and_test</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/how-mig-maximizes-gp/0:fb41cd"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;p&gt;Modern data science workloads demand high computational power, and Graphic Processing Units (GPUs) are often at the heart of these operations. However, sharing GPU resources efficiently among multiple users or workloads can be challenging. &lt;a class="external" href="https://www.nvidia.com/en-us/technologies/multi-instance-gpu/" rel="nofollow"&gt;NVIDIA Multi-Instance GPU&lt;/a&gt; (MIG) technology offers a solution. This article explores how I tested MIG on &lt;a class="external" href="https://developers.redhat.com/products/red-hat-openshift-ai/overview" rel="nofollow"&gt;Red Hat OpenShift AI&lt;/a&gt; using an NVIDIA Ampere architecture GPU and the benefits for AI and data science teams.&lt;/p&gt;&lt;h2&gt;The NVIDIA MIG solution and test&lt;/h2&gt;&lt;p&gt;GPUs in a &lt;a class="external" href="https://developers.redhat.com/topics/kubernetes/" rel="nofollow"&gt;Kubernetes&lt;/a&gt; environment are assigned to pods in a 1:1 ratio by default. This means a single GPU is dedicated to one pod, regardless of whether the workload fully utilizes the GPU’s capacity. This limitation can lead to inefficient resource usage, especially for smaller workloads. NVIDIA MIG solves this issue by splitting a single GPU into multiple independent instances to be used by different pods. This feature maximizes GPU utilization and ensures resources are not wasted. In the next sections, I will demonstrate how I tested MIG on Red Hat OpenShift AI.&lt;/p&gt;&lt;h3&gt;Prepare the environment&lt;/h3&gt;&lt;p&gt;For this test, certain preparatory steps are required to leverage MIG on OpenShift. I used Azure’s &lt;code&gt;Standard_NC24ads_A100_v4&lt;/code&gt; virtual machine (VM), equipped with an NVIDIA A100 PCIe 80GB GPU as an OpenShift worker (Figure 1).&lt;/p&gt;
&lt;h4&gt;Step 1: Install NFD&lt;/h4&gt;&lt;p&gt;First, I installed the Node Feature Discovery (NFD) operator, as shown in Figures 2 and 3.&lt;/p&gt;
&lt;p&gt;This operator detects hardware features and ensures that GPUs are discoverable by the NVIDIA GPU operator.&lt;/p&gt;
&lt;p&gt;We will see many labels added to the node, indicating the operator detects its GPU:&lt;/p&gt;&lt;div&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ oc describe node/ods-cluster-mqt7l-worker-eastus2-fn5w8
                        Labels:             beta.kubernetes.io/arch=amd64
                                          feature.node.kubernetes.io/cpu-cpuid.ADX=true
                                          feature.node.kubernetes.io/cpu-cpuid.AESNI=true
                                          ...
                                          feature.node.kubernetes.io/cpu-cpuid.FMA3=true
                                          feature.node.kubernetes.io/gpu.present=true
                                          feature.node.kubernetes.io/gpu.memory=80GB
                                          feature.node.kubernetes.io/gpu.vendor=nvidia
                                          feature.node.kubernetes.io/gpu.model=A100&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h4&gt;Step 2: Install the NVIDIA GPU operator&lt;/h4&gt;&lt;p&gt;Next, I installed the NVIDIA GPU operator, which handles the configuration of GPU resources (Figure 4).&lt;/p&gt;
&lt;p&gt;I made sure to enable the MIG manager in the ClusterPolicy configuration to facilitate the MIG setup (Figure 5).&lt;/p&gt;
&lt;h4&gt;Step 3: Check the pods&lt;/h4&gt;&lt;p&gt;There are two ways to make sure all pods under the &lt;code&gt;nvidia-gpu-operator&lt;/code&gt; namespace are up and running:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;From the CLI:&lt;/p&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ oc get pods -n nvidia-gpu-operator&lt;/code&gt;&lt;/pre&gt;&lt;/li&gt;&lt;li&gt;From the console, as shown in Figure 6:&lt;/li&gt;&lt;/ol&gt;
&lt;h3&gt;Choose the right MIG configuration&lt;/h3&gt;&lt;p&gt;MIG offers a variety of configurations tailored to different GPU models and workload requirements. You have to understand which configurations are supported for the NVIDIA A100–80GB GPU. For example, I ran the command &lt;code&gt;oc describe configmap/default-mig-parted-config&lt;/code&gt;, explored the available configurations, and selected one that matched my requirements: &lt;code&gt;1g.10gb&lt;/code&gt;, which divides the GPU into seven instances.&lt;/p&gt;&lt;p&gt;The following configuration is ideal for workloads that require smaller, dedicated slices of GPU power.&lt;/p&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;    # H100-80GB, H800-80GB, A100-80GB, A800-80GB, A100-40GB, A800-40GB
     all-1g.10gb:
       # H100-80GB, H800-80GB, A100-80GB, A800-80GB
       - device-filter: ["0x233010DE", "0x233110DE", "0x232210DE", "0x20B210DE", "0x20B510DE", "0x20F310DE", "0x20F510DE", "0x232410DE"]
         devices: all
         mig-enabled: true
         mig-devices:
           "1g.10gb": 7&lt;/code&gt;&lt;/pre&gt;&lt;h3&gt;Enable and verify MIG&lt;/h3&gt;&lt;p&gt;To verify the setup, I used the &lt;code&gt;nvidia-smi&lt;/code&gt; tool to query the GPU status and configurations. When MIG was initially disabled, I enabled it and restarted the node:&lt;/p&gt;&lt;div&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;sh-4.4# nvidia-smi -i 0 -mig 1
Enabled MIG Mode for GPU 00000001:00:00.0
All done.&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;To verify that MIG is enabled for the GPU, I connected to the &lt;code&gt;nvidia-mig-manager&lt;/code&gt; pod in OpenShift and used the terminal tab to query &lt;code&gt;GPU=0&lt;/code&gt; configurations with the following command:&lt;/p&gt;&lt;div&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;sh-4.4# nvidia-smi -i 0 -q
==============NVSMI LOG==============
Timestamp                           : Tue Dec  5 15:41:13 2023
Driver Version                      : 535.104.12
CUDA Version                        : Not Found
Attached GPUs                       : 1
GPU 00000001:00:00.0
    Product Name                    : NVIDIA A100 80GB PCIe
    Product Brand                   : NVIDIA
    Product Architecture            : Ampere
    Display Mode                    : Enabled
    Display Active                  : Disabled
    Persistence Mode                : Enabled
    Addressing Mode                 : None
    MIG Mode
        Current                     : Enabled
        Pending                     : Enabled&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;After selecting the configuration, I labeled the node with the following command:&lt;/p&gt;&lt;div&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ oc label node &amp;lt;node-name&amp;gt; nvidia.com/mig.config=all-1g.10gb --overwrite&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The MIG manager pod logs provide insight into the status of the node labeling process (Figure 7).&lt;/p&gt;
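&lt;p&gt;The same progress can also be followed from the CLI by tailing the MIG manager logs. The label selector below is an assumption and may need to be adapted to your environment:&lt;/p&gt;&lt;div&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ oc logs -n nvidia-gpu-operator -l app=nvidia-mig-manager --tail=20 -f&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;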
&lt;p&gt;Once successful, the node reported multiple allocatable GPUs instead of a single one.&lt;/p&gt;&lt;p&gt;Let's describe the node to confirm that it recognizes seven GPUs:&lt;/p&gt;&lt;div&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ oc describe node/ods-cluster-mqt7l-worker-eastus2-fn5w8
Capacity:
  attachable-volumes-azure-disk: 8
  cpu: 24
  ephemeral-storage: 133682156Ki
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 226965748Ki
  nvidia.com/gpu: 7
  pods: 250
Allocatable:
  attachable-volumes-azure-disk: 8
  cpu: 23500m
  ephemeral-storage: 122127732942
  hugepages-1Gi: 0
  hugepages-2Mi: 0
  memory: 225814772Ki
  nvidia.com/gpu: 7
  pods: 250&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3&gt;Consume the sliced GPUs via Red Hat OpenShift AI&lt;/h3&gt;&lt;p&gt;With MIG enabled, the OpenShift AI dashboard reflected the increased availability of GPU resources. I could select up to seven GPUs for my workbench (Figure 8). This setup empowers AI and data science teams to run diverse workloads simultaneously without bottlenecks.&lt;/p&gt;
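&lt;p&gt;Outside the dashboard, the slices are consumed like any other GPU resource: a pod requests &lt;code&gt;nvidia.com/gpu&lt;/code&gt;, and with the single MIG strategy used here each unit maps to one 1g.10gb instance. The manifest below is a minimal sketch with placeholder names and image, not something taken from my test:&lt;/p&gt;&lt;div&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;apiVersion: v1
kind: Pod
metadata:
  name: mig-slice-example       # placeholder name
spec:
  restartPolicy: Never
  containers:
  - name: workload
    image: nvcr.io/nvidia/cuda:12.4.1-base-ubi9   # placeholder image
    command: ["nvidia-smi", "-L"]                 # prints the single MIG device visible to the pod
    resources:
      limits:
        nvidia.com/gpu: 1       # one 1g.10gb slice out of the seven created above
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;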
&lt;h2&gt;Unlock GPU potential with NVIDIA MIG and OpenShift AI&lt;/h2&gt;&lt;p&gt;NVIDIA MIG technology, integrated with Red Hat OpenShift AI, transforms GPU resource management by facilitating scalable and efficient workloads. By partitioning GPUs into smaller, independent units, organizations can achieve maximum resource utilization, cost savings, and streamlined &lt;a class="external" href="https://developers.redhat.com/topics/ai-ml" rel="nofollow"&gt;AI/ML&lt;/a&gt; operations. MIG on OpenShift AI helps teams fully harness the power of GPU technology, whether they manage diverse workloads or scale multi-user environments.&lt;/p&gt;&lt;p&gt;Learn more about &lt;a class="external" href="https://developers.redhat.com/articles/2024/11/12/generative-ai-nvidia-nim-openshift-ai" rel="nofollow"&gt;using NVIDIA NIM on Red Hat OpenShift AI&lt;/a&gt; and the&lt;a class="external" href="https://www.redhat.com/en/blog/sharing-caring-how-make-most-your-gpus-part-2-multi-instance-gpu" rel="nofollow"&gt; performance results&lt;/a&gt; shown by Red Hat AI Performance and Scale when testing NVIDIA GPUs with MIG.&lt;/p&gt;</summary></entry><entry><title>Dumping packets from anywhere in the networking stack | Red Hat Developer</title><link href="https://developers.redhat.com/articles/2025/01/09/dumping-packets-anywhere-networking-stack" rel="alternate"></link><published>2025-01-17T14:46:04.358000Z</published><id>https://developers.redhat.com/articles/2025/01/09/dumping-packets-anywhere-networking-stack</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/dumping-packets-from/0:62ae6d"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div&gt;&lt;div class="article-content pf-c-content pf-l-grid__item rhd-c-fetch-article-toc"&gt; &lt;p&gt;Dumping traffic on a network interface is one of the most performed steps while debugging networking and connectivity issues. On &lt;a class="external" href="https://developers.redhat.com/topics/linux/" rel="nofollow"&gt;Linux&lt;/a&gt;, &lt;a class="external" href="https://www.tcpdump.org/" rel="nofollow"&gt;tcpdump&lt;/a&gt; is probably the most common way to do this, but some use &lt;a class="external" href="https://www.wireshark.org/" rel="nofollow"&gt;Wireshark&lt;/a&gt; too.&lt;/p&gt;&lt;h2&gt;Where does tcpdump get the packets from?&lt;/h2&gt;&lt;p&gt;Internally, both &lt;code&gt;tcpdump&lt;/code&gt; and &lt;code&gt;Wireshark&lt;/code&gt; use the Packet Capture (&lt;code&gt;pcap&lt;/code&gt;) library. When capturing packets, a socket with the &lt;code&gt;PF_PACKET&lt;/code&gt; domain is created (see &lt;code&gt;man packet&lt;/code&gt;) which allows you to receive and send packets at the layer 2 from the &lt;a class="external" href="https://en.wikipedia.org/wiki/OSI_model" rel="nofollow"&gt;OSI model&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;From &lt;a class="external" href="https://github.com/the-tcpdump-group/libpcap" rel="nofollow"&gt;libpcap&lt;/a&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;sock_fd = is_any_device ?
       socket(PF_PACKET, SOCK_DGRAM, 0) :
       socket(PF_PACKET, SOCK_RAW, 0);&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Note that the last parameter in the socket call is later set to a specific protocol, or &lt;code&gt;ETH_P_ALL&lt;/code&gt; if none is explicitly provided. The latter makes the socket receive all packets.&lt;/p&gt;&lt;p&gt;This allows packets to be captured directly after the device driver on ingress, without any change being made to the packet, and right before they enter the device driver on egress. To put it differently, packets are seen between the networking stack and the NIC drivers.&lt;/p&gt;&lt;h2&gt;Limitations&lt;/h2&gt;&lt;p&gt;While the above use of &lt;code&gt;PF_PACKET&lt;/code&gt; works nicely, it also comes with limitations. As packets are retrieved from a very specific and defined place of the networking stack, they can only be seen in the state they were in at that point, e.g., on ingress packets are seen before being processed by the firewall or qdiscs, and the opposite is true on egress.&lt;/p&gt;&lt;h2&gt;Offline analysis&lt;/h2&gt;&lt;p&gt;By default, &lt;code&gt;tcpdump&lt;/code&gt; and &lt;code&gt;Wireshark&lt;/code&gt; process packets live at runtime. But they can also store the captured packet data in a file for later analysis (the &lt;code&gt;-w&lt;/code&gt; option for &lt;code&gt;tcpdump&lt;/code&gt;). The &lt;code&gt;pcap&lt;/code&gt; file format (&lt;code&gt;application/vnd.tcpdump.pcap&lt;/code&gt;) is used. Both tools (and others, e.g., &lt;a class="external" href="https://tshark.dev/" rel="nofollow"&gt;tshark&lt;/a&gt;) support reading &lt;code&gt;pcap&lt;/code&gt;-formatted files.&lt;/p&gt;&lt;h2&gt;How to capture packets from other places?&lt;/h2&gt;&lt;p&gt;Retrieving packets from other places of the networking stack using &lt;code&gt;tcpdump&lt;/code&gt; or &lt;code&gt;Wireshark&lt;/code&gt; is not possible. However, other initiatives have emerged that target monitoring traffic within a single host, like &lt;a class="external" href="https://github.com/retis-org/retis" rel="nofollow"&gt;Retis&lt;/a&gt; (&lt;a class="external" href="https://retis.readthedocs.io" rel="nofollow"&gt;documentation&lt;/a&gt;).&lt;/p&gt;&lt;p&gt;Retis is a recently released tool that aims to improve visibility into the Linux networking stack and various control and data paths. It allows capturing networking-related events and providing relevant context using eBPF, with one notable feature being capturing packets on any (packet-aware, AKA socket buffer) kernel function and tracepoint.&lt;/p&gt;&lt;p&gt;To capture packets from the &lt;code&gt;net:netif_receive_skb&lt;/code&gt; &lt;a class="external" href="https://docs.kernel.org/trace/tracepoints.html" rel="nofollow"&gt;tracepoint&lt;/a&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ retis collect -c skb -p net:netif_receive_skb
4 probe(s) loaded
4581128037918 (8) [irq/188-iwlwifi] 1264 [tp] net:netif_receive_skb
 if 4 (wlp82s0) 2606:4700:4700::1111.53 &amp;gt; [redacted].34952 ttl 54 label 0x66967 len 79 proto UDP (17) len 71&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Note that Retis can capture packets from multiple functions and tracepoints by using the above &lt;code&gt;-p&lt;/code&gt; option multiple times. It can even identify packets and reconstruct their flow! To get a list of compatible functions and tracepoints, use &lt;code&gt;retis inspect -p&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Also note that by default &lt;code&gt;tcpdump&lt;/code&gt; and &lt;code&gt;Wireshark&lt;/code&gt; put devices in promiscuous mode when dumping packets from a specific interface. This is not the case with Retis. An interface can be set in this mode manually by using &lt;code&gt;ip link set &amp;lt;interface&amp;gt; promisc on&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;In addition to the above, another tool provides a way to capture packets and convert them to a &lt;code&gt;pcap&lt;/code&gt; file: bpftrace. It is a wonderful tool, but it is lower level and requires you to write the probe definitions by hand and to compile the BPF program on the target. Here the &lt;code&gt;skboutput&lt;/code&gt; function can be used, as &lt;a class="external" href="https://github.com/bpftrace/bpftrace/blob/v0.21.2/man/adoc/bpftrace.adoc#functions-skboutput" rel="nofollow"&gt;shown in the help&lt;/a&gt;.&lt;/p&gt;&lt;h2&gt;Making the link&lt;/h2&gt;&lt;p&gt;That's nice, but while Retis is a powerful tool when used standalone, we might want to use the existing &lt;code&gt;tcpdump&lt;/code&gt; and &lt;code&gt;Wireshark&lt;/code&gt; tools, but with packets captured from other places of the networking stack.&lt;/p&gt;&lt;p&gt;This can be done by using the Retis &lt;code&gt;pcap&lt;/code&gt; post-processing command. This works in two steps: Retis first captures and stores packets, then post-processes them. The &lt;code&gt;pcap&lt;/code&gt; sub-command converts Retis-saved packets to the &lt;code&gt;pcap&lt;/code&gt; format. The result can then be used to feed existing &lt;code&gt;pcap&lt;/code&gt;-aware tools, such as &lt;code&gt;tcpdump&lt;/code&gt; and &lt;code&gt;Wireshark&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ retis collect -c skb -p net:netif_receive_skb -p net:net_dev_start_xmit -o
$ retis print
4581115688645 (9) [isc-net-0000] 12796/12797 [tp] net:net_dev_start_xmit
 if 4 (wlp82s0) [redacted].34952 &amp;gt; 2606:4700:4700::1111.53 ttl 64 label 0x79c62 len 59 proto UDP (17) len 51
4581128037918 (8) [irq/188-iwlwifi] 1264 [tp] net:netif_receive_skb
 if 4 (wlp82s0) 2606:4700:4700::1111.53 &amp;gt; [redacted].34952 ttl 54 label 0x66967 len 79 proto UDP (17) len 71

$ retis pcap --probe net:net_dev_start_xmit | tcpdump -nnr -
01:31:55.688645 IP6 [redacted].34952 &amp;gt; 2606:4700:4700::1111.53: 28074+ [1au] A? &amp;lt;a href="http://redhat.com" rel="nofollow"&amp;gt;redhat.com&amp;lt;/a&amp;gt;. (51)

$ retis pcap --probe net:netif_receive_skb -o retis.pcap
$ wireshark retis.pcap&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As seen above, Retis can collect packets from multiple probes during the same session. All packets seen on a given probe can then be filtered and converted to the &lt;code&gt;pcap&lt;/code&gt; format.&lt;/p&gt;&lt;p&gt;When generating &lt;code&gt;pcap&lt;/code&gt; files, Retis adds a comment in every packet with a description of the probe the packet was retrieved on:&lt;/p&gt;&lt;pre&gt;&lt;code class="language-plaintext"&gt;$ capinfos -p retis.pcap
File name:           retis.pcap
Packet 1 Comment:    probe=raw_tracepoint:net:netif_receive_skb&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In many cases, tools like &lt;code&gt;tcpdump&lt;/code&gt; and &lt;code&gt;Wireshark&lt;/code&gt; are sufficient. But, due to their design, they can only dump packets from a very specific place of the networking stack, which in some cases can be limiting. When that's the case it's possible to use more recent tools like Retis, either standalone or in combination with the beloved pcap aware utilities to allow using familiar tools or easily integrate this into existing scripts.&lt;/p&gt; &lt;/div&gt;&lt;/div&gt;</summary></entry><entry><title>Red Hat OpenStack Services on OpenShift: Rethinking storage design in pod-based architectures</title><link href="https://www.redhat.com/en/blog/red-hat-openstack-services-on-openshift-rethinking-storage-design-pod-based-architectures" rel="alternate"></link><published>2025-01-14T10:45:52.764000Z</published><id>https://www.redhat.com/en/blog/red-hat-openstack-services-on-openshift-rethinking-storage-design-pod-based-architectures</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/red-hat-openstack-se/5741853:550c59"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/5741853.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Red Hat Blog.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="rh-generic--component"&gt; &lt;p&gt;With the release of&lt;a class="external" href="/en/about/press-releases/red-hat-openstack-services-openshift-now-generally-available" rel="nofollow"&gt; Red Hat OpenStack Services on OpenShift&lt;/a&gt;, there is a major change in the design and architecture that impacts how OpenStack is deployed and managed. The OpenStack control plane has moved from traditional standalone containers on &lt;a class="external" href="/en/technologies/linux-platforms/enterprise-linux" rel="nofollow"&gt;Red Hat Enterprise Linux&lt;/a&gt; (RHEL) to an advanced pod-based&lt;a class="external" href="/en/topics/containers/what-is-kubernetes" rel="nofollow"&gt; Kubernetes &lt;/a&gt;managed architecture.&lt;/p&gt;&lt;h2&gt;Introducing Red Hat OpenStack Services on OpenShift&lt;/h2&gt;&lt;p&gt;In this new form factor, the OpenStack control services such as keystone, nova, glance and neutron that were once deployed as standalone containers on top of bare metal or &lt;a class="external" href="/en/topics/virtualization/what-is-a-virtual-machine" rel="nofollow"&gt;virtual machines &lt;/a&gt;(VMs) are now deployed as native &lt;a class="external" href="/en/technologies/cloud-computing/openshift" rel="nofollow"&gt;Red Hat OpenShift&lt;/a&gt; pods leveraging the flexibility, placement, abstraction and scalability of Kubernetes orchestration&lt;/p&gt;&lt;p&gt;The OpenStack compute nodes that are running VMs are still relying on RHEL, with the difference being that it is provisioned by Metal3 and configured by an OpenShift operator using &lt;a class="external" href="/en/technologies/management/ansible" rel="nofollow"&gt;Red Hat Ansible Automation Platform&lt;/a&gt; behind the scenes. It is worth noting that it’s still possible to bring preprovisioned nodes with RHEL pre-install.&lt;/p&gt;&lt;h2&gt;New approach, new storage considerations&lt;/h2&gt;&lt;p&gt;Deploying and managing the OpenStack control plane on top of OpenShift brings several new advantages, but it also comes with new storage considerations.&lt;/p&gt;&lt;p&gt;Previously, the OpenStack control plane was deployed as three “controllers” which usually took form as bare metal servers or, in some cases, VMs.&lt;/p&gt;&lt;p&gt;In terms of storage, the OpenStack control services used the server’s local disk(s) to write persistent data (or a network storage backend when booting from your storage area network (SAN)).&lt;/p&gt;&lt;p&gt;With the shift to a native OpenShift approach, the OpenStack control services are dynamically scheduled across OpenShift workers as pods. This approach introduces a number of benefits, but the default pod storage option is to use ephemeral storage. Ephemeral storage is perfectly fine for stateless services such as the service’s API, but not appropriate for services that require persistent data such as the control plane database. When a pod restarts or terminates, it must get its data back.&lt;/p&gt;&lt;p&gt;Fortunately, OpenShift provides a &lt;a class="external" href="https://docs.openshift.com/container-platform/latest/storage/understanding-persistent-storage.html" rel="nofollow"&gt;persistent storage abstraction layer&lt;/a&gt; in the form of “Persistent Volumes” (PV) and “Persistent Volume Claim” (PVC) that enable pods to mount volumes that persist across pod’s lifecycle. 
This persistent storage framework is tightly coupled with another standard called &lt;a class="external" href="https://docs.openshift.com/container-platform/latest/storage/container_storage_interface/persistent-storage-csi.html" rel="nofollow"&gt;Container Storage Interface&lt;/a&gt; (CSI) that allows OpenShift to provision volumes from a variety of storage backends should the storage vendor provide a certified CSI Driver.&lt;/p&gt; &lt;a class="rhdc-media__image-link" href="/rhdc/managed-files/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%2C%20high%20level%20design_0.png" rel="nofollow"&gt; &lt;img alt="Red Hat OpenStack Services on OpenShift, high level design" src="https://www.redhat.com/rhdc/managed-files/styles/wysiwyg_full_width/private/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%2C%20high%20level%20design_0.png.webp?itok=0BVD-rYM" width="1170"/&gt; &lt;/a&gt; &lt;p&gt;&lt;br/&gt;This is where the paradigm changes, in previous versions of Red Hat OpenStack, the control services' persistent data were stored on local controllers disks and no further design decisions were needed besides the size, type, performance and RAID level of the disks.&lt;/p&gt;&lt;p&gt;With OpenStack Services on OpenShift, a storage solution must also be considered for OpenShift alongside the traditional OpenStack storage.&lt;/p&gt;&lt;p&gt;In this article, we dive into the main available options to back OpenShift and OpenStack data for environments that are using Ceph or third-party storage solutions.&lt;/p&gt;&lt;p&gt;Before we get into the details, you may wonder which OpenStack control services need persistent storage:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Glance for the staging area and optional cache&lt;/li&gt;&lt;li&gt;Galera for storing the database&lt;/li&gt;&lt;li&gt;OVN Northbound and Southbound database&lt;/li&gt;&lt;li&gt;RabbitMQ for storing the queues&lt;/li&gt;&lt;li&gt;Swift for storing object data when not using external physical nodes&lt;/li&gt;&lt;li&gt;Telemetry for storing metrics&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Red Hat OpenStack Services on OpenShift with Red Hat Ceph Storage&lt;/h2&gt;&lt;p&gt;Ceph is a well known and widely used storage backend for OpenStack. 
It can serve block with Nova, Glance and Cinder, file with Manila, and object with S3/SWIFT APIs.&lt;/p&gt;&lt;p&gt;The integration between OpenStack Services on OpenShift and Ceph is the same as previous OpenStack versions—block is served by RADOS block devices (RBD), file by CephFS or network file system (NFS) and object by S3 or SWIFT.&lt;/p&gt;&lt;p&gt;The different OpenStack services are configured to connect to the Ceph cluster, but what changes is the way you configure it at install time, as we are now using native Kubernetes Custom Resources Definition (CDR) instead of TripleO templates as in previous versions.&lt;/p&gt;&lt;p&gt;The main design change is how to serve OpenShift volumes.&lt;/p&gt;&lt;h3&gt;Using Ceph across both platforms&lt;/h3&gt;&lt;p&gt;The first option is to use the same external Ceph cluster between OpenStack and OpenShift, consolidating the Ceph investment by sharing the storage resources.&lt;/p&gt; &lt;a class="rhdc-media__image-link" href="/rhdc/managed-files/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20with%20shared%20Ceph%20cluster_0.png" rel="nofollow"&gt; &lt;img alt="Red Hat OpenStack Services on OpenShift design with shared Ceph cluster" src="https://www.redhat.com/rhdc/managed-files/styles/wysiwyg_full_width/private/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20with%20shared%20Ceph%20cluster_0.png.webp?itok=SF5pEWDC" width="1170"/&gt; &lt;/a&gt; &lt;p&gt;&lt;br/&gt; In the above diagram, OpenStack is consuming Ceph as usual, and OpenShift uses OpenShift Data Foundation (ODF) external mode to connect to the same cluster. &lt;a class="external" href="https://docs.redhat.com/en/documentation/red_hat_openshift_data_foundation/4.16/html/deploying_openshift_data_foundation_in_external_mode/index" rel="nofollow"&gt;ODF external&lt;/a&gt; deploys the Ceph CSI drivers that allow OpenShift to provision persistent volumes from a Ceph cluster.&lt;/p&gt;&lt;p&gt;OpenStack and OpenShift use different Ceph pools and keys, but architects should review their cluster’s capacity and performance to anticipate any potential impact. It’s also possible to isolate the storage I/O of both platforms by customizing the CRUSH map and allowing data to be stored on different object storage daemons (OSDs).&lt;/p&gt;&lt;p&gt;The design outlined above shares the same Ceph cluster between OpenShift and OpenStack but they can be different clusters based on the use case.&lt;/p&gt;&lt;h3&gt;Third-party or local storage for OpenShift and Ceph for OpenStack&lt;/h3&gt;&lt;p&gt;In some cases, you do not want to share OpenStack and OpenShift data on the same cluster. As mentioned before, it’s possible to use another Ceph cluster but the capacity needed for the control plane services may not be enough to justify it.&lt;/p&gt;&lt;p&gt;Another option is to leverage OpenShift’s workers' local disks. To do so, OpenShift includes an out-of-the-box logical volume manager (LVM) based CSI operator called LVM Storage (&lt;a class="external" href="https://docs.openshift.com/container-platform/4.16/storage/persistent_storage/persistent_storage_local/persistent-storage-using-lvms.html" rel="nofollow"&gt;LVMS&lt;/a&gt;). LVMS allows dynamic local provisioning of the persistent volumes via LVM on the workers' local disks. This has the advantage of using local direct disk performance at a minimum cost.&lt;/p&gt;&lt;p&gt;On the other hand, if the data being local to the worker, the pods relying on volumes cannot be evacuated to other workers. 
This is a limitation to consider, especially if OpenStack control services are deployed on more than three workers.&lt;/p&gt;&lt;p&gt;It is also possible to rely on an existing third-party backend using a certified CSI driver which would remove the 1:1 pinning between the pod and the volume but can increase the cost. Using ODF internally as an OpenShift storage solution is also an option.&lt;/p&gt;&lt;p&gt;The OpenStack integration to Ceph remains the same.&lt;/p&gt; &lt;a class="rhdc-media__image-link" href="/rhdc/managed-files/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20with%20Ceph%20cluster%20for%20OpenStack%20and%20a.png" rel="nofollow"&gt; &lt;img alt="Red Hat OpenStack Services on OpenShift design with Ceph cluster for OpenStack and alternative solution for OpenShift" src="https://www.redhat.com/rhdc/managed-files/styles/wysiwyg_full_width/private/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20with%20Ceph%20cluster%20for%20OpenStack%20and%20a.png.webp?itok=SVbgq5s9" width="1170"/&gt; &lt;/a&gt; &lt;h3&gt;OpenStack with Ceph hyper-converged&lt;/h3&gt;&lt;p&gt;Deploying Ceph hyper-converged with OpenStack compute nodes is a popular solution to combine both compute and storage resources on the same hardware, reducing the cost and hardware footprint.&lt;/p&gt;&lt;p&gt;The integration with Ceph does not differ from an external Ceph besides the fact that the compute and storage services are collocated.&lt;/p&gt;&lt;p&gt;The OpenShift storage options are more limited, however, as it is not possible to use the hyper-converged Ceph cluster to back OpenShift persistent volumes.&lt;/p&gt;&lt;p&gt;The options are the same as those outlined in the previous section—OpenShift can rely on LVMS to leverage the local worker disks or use an existing third-party backend with a certified CSI driver.&lt;/p&gt; &lt;a class="rhdc-media__image-link" href="/rhdc/managed-files/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20Ceph%20HyperConverged%20for%20OpenStack.png" rel="nofollow"&gt; &lt;img alt="Red Hat OpenStack Services on OpenShift design with Ceph HyperConverged for OpenStack" src="https://www.redhat.com/rhdc/managed-files/styles/wysiwyg_full_width/private/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20Ceph%20HyperConverged%20for%20OpenStack.png.webp?itok=tWObEvkY" width="1170"/&gt; &lt;/a&gt; &lt;h3&gt;OpenStack with third-party storage solutions&lt;/h3&gt;&lt;p&gt;For environments that are not using Ceph, the same principle applies. The OpenStack integration does not change, the control and compute services are configured to use an external shared storage backend through iSCSI, FC, NFS, NVMe/TCP or other vendor-specific protocols. Cinder and Manila drivers are still used to integrate the storage solution with OpenStack.&lt;/p&gt;&lt;p&gt;On the OpenShift side, the options are to either use LVMS to leverage the local worker disks or use an existing third-party backend with a certified CSI driver. 
This third-party backend can be the same as the one used for OpenStack or a different one.&lt;/p&gt; &lt;a class="rhdc-media__image-link" href="/rhdc/managed-files/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20with%20third%20party%20storage_0.png" rel="nofollow"&gt; &lt;img alt="Red Hat OpenStack Services on OpenShift design with third party storage" src="https://www.redhat.com/rhdc/managed-files/styles/wysiwyg_full_width/private/Red%20Hat%20OpenStack%20Services%20on%20OpenShift%20design%20with%20third%20party%20storage_0.png.webp?itok=rVaRerCg" width="1170"/&gt; &lt;/a&gt; &lt;h2&gt;Wrap up&lt;/h2&gt;&lt;p&gt;As Red Hat OpenStack moves to a more modern OpenShift-based deployment model, new storage systems need to be considered. Red Hat OpenStack Services on OpenShift offers a broad set of options for storing the OpenStack control services and the end user’s data. Whether you’re using Ceph or not, and whether you want shared storage or to rely on local disks, the different supported combinations will match a vast set of use cases and requirements.&lt;/p&gt;&lt;p&gt;For more details on Red Hat OpenStack Services on OpenShift storage integration, please refer to our &lt;a class="external" href="https://docs.redhat.com/en/documentation/red_hat_openstack_services_on_openshift/18.0/html/planning_your_deployment/index" rel="nofollow"&gt;planning guide&lt;/a&gt;. &lt;/p&gt; &lt;/div&gt;</summary></entry><entry><title>How To Create Multi-Step Forms With Vanilla JavaScript And CSS</title><link href="https://css-tricks.com/how-to-create-multi-step-forms-with-vanilla-javascript-and-css/" rel="alternate"></link><published>2024-12-18T17:25:24.039000Z</published><id>https://css-tricks.com/how-to-create-multi-step-forms-with-vanilla-javascript-and-css/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/how-to-create-multi-/9536825:7511da"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/9536825.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Comments on: How to Create Multi-Step Forms With Vanilla JavaScript and CSS.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="article-content"&gt; &lt;p&gt;Multi-step forms are a good choice when your form is large and has many controls. No one wants to scroll through a super-long form on a mobile device. By grouping controls on a screen-by-screen basis, we can improve the experience of filling out long, complex forms.&lt;/p&gt; &lt;p&gt;But when was the last time you developed a multi-step form? Does that even sound fun to you? There’s so much to think about and so many moving pieces that need to be managed that I wouldn’t blame you for resorting to a form library or even some type of form widget that handles it all for you.&lt;/p&gt; &lt;p&gt;But doing it by hand can be a good exercise and a great way to polish the basics. I’ll show you how I built my first multi-step form, and I hope you’ll not only see how approachable it can be but maybe even spot areas to make my work even better.&lt;/p&gt; &lt;p&gt;We’ll walk through the structure together. We’ll build a job application, which I think many of us can relate to these recent days. I’ll scaffold the baseline HTML, CSS, and JavaScript first, and then we’ll look at considerations for accessibility and validation.&lt;/p&gt; &lt;span&gt;&lt;/span&gt; &lt;p&gt;I’ve created a &lt;a class="external" href="https://github.com/FatumaA/mulit-step-form/" rel="nofollow"&gt;GitHub repo for the final code&lt;/a&gt; if you want to refer to it along the way.&lt;/p&gt; &lt;p&gt;Our job application form has four sections, the last of which is a summary view, where we show the user all their answers before they submit them. To achieve this, we divide the HTML into four sections, each identified with an ID, and add navigation at the bottom of the page. I’ll give you that baseline HTML in the next section.&lt;/p&gt; &lt;p&gt;Navigating the user to move through sections means we’ll also include a visual indicator for what step they are at and how many steps are left. This indicator can be a simple dynamic text that updates according to the active step or a fancier progress bar type of indicator. We’ll do the former to keep things simple and focused on the multi-step nature of the form.,&lt;/p&gt; &lt;p&gt;We’ll focus more on the logic, but I will provide the code snippets and a link to the complete code at the end.&lt;/p&gt; &lt;p&gt;Let’s start by creating a folder to hold our pages. Then, create an &lt;code&gt;index.html&lt;/code&gt; file and paste the following into it:&lt;/p&gt;  &lt;p&gt;Looking at the code, you can see three sections and the navigation group. The sections contain form inputs and no native form validation. This is to give us better control of displaying the error messages because native form validation is only triggered when you click the submit button.&lt;/p&gt; &lt;p&gt;Next, create a &lt;code&gt;styles.css&lt;/code&gt; file and paste this into it:&lt;/p&gt;  &lt;p&gt;Open up the HTML file in the browser, and you should get something like the two-column layout in the following screenshot, complete with the current page indicator and navigation.&lt;/p&gt;  &lt;p&gt;Now, create a &lt;code&gt;script.js&lt;/code&gt; file in the same directory as the HTML and CSS files and paste the following JavaScript into it:&lt;/p&gt;  &lt;p&gt;This script defines a method that shows and hides the section depending on the &lt;code&gt;formStep&lt;/code&gt; values that correspond to the IDs of the form sections. It updates &lt;code&gt;stepInfo&lt;/code&gt; with the current active section of the form. 
This dynamic text acts as a progress indicator to the user.&lt;/p&gt; &lt;p&gt;It then adds logic that waits for the page to load and click events to the navigation buttons to enable cycling through the different form sections. If you refresh your page, you will see that the multi-step form works as expected.&lt;/p&gt; &lt;p&gt;Let’s dive deeper into what the Javascript code above is doing. In the &lt;code&gt;updateStepVisibility()&lt;/code&gt; function, we first hide all the sections to have a clean slate:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;formSteps.forEach((step) =&amp;gt; {
  document.getElementById(step).style.display = "none";
});&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Then, we show the currently active section:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;document.getElementById(formSteps[currentStep]).style.display = "block";&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Next, we update the text that indicates progress through the form:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;stepInfo.textContent = `Step ${currentStep + 1} of ${formSteps.length}`;&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Finally, we hide the Previous button if we are at the first step and hide the Next button if we are at the last section:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;navLeft.style.display = currentStep === 0 ? "none" : "block";
navRight.style.display = currentStep === formSteps.length - 1 ? "none" : "block";&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Let’s look at what happens when the page loads. We first hide the Previous button as the form loads on the first section:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;document.addEventListener("DOMContentLoaded", () =&amp;gt; {
navLeft.style.display = "none";
updateStepVisibility();&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Then we grab the Next button and add a click event that conditionally increments the current step count and then calls the &lt;code&gt;updateStepVisibility()&lt;/code&gt; function, which then updates the new section to be displayed:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;navRight.addEventListener("click", () =&amp;gt; {
  if (currentStep &amp;lt; formSteps.length - 1) {
    currentStep++;
    updateStepVisibility();
  }
});&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Finally, we grab the Previous button and do the same thing but in reverse. Here, we are conditionally decrementing the step count and calling the &lt;code&gt;updateStepVisibility()&lt;/code&gt;:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;navLeft.addEventListener("click", () =&amp;gt; {
  if (currentStep &amp;gt; 0) {
    currentStep--;
    updateStepVisibility();
  }
});&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Have you ever spent a good 10+ minutes filling out a form only to submit it and get vague errors telling you to correct this and that? I prefer it when a form tells me right away that something’s amiss so that I can correct it &lt;em&gt;before&lt;/em&gt; I ever get to the Submit button. That’s what we’ll do in our form.&lt;/p&gt; &lt;p&gt;Our principle is to clearly indicate which controls have errors and give meaningful error messages. Clear errors as the user takes necessary actions. Let’s add some validation to our form. First, let’s grab the necessary input elements and add this to the existing ones:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;const nameInput = document.getElementById("name");
const idNumInput = document.getElementById("idNum");
const emailInput = document.getElementById("email");
const birthdateInput = document.getElementById("birthdate")
const documentInput = document.getElementById("document");
const departmentInput = document.getElementById("department");
const termsCheckbox = document.getElementById("terms");
const skillsInput = document.getElementById("skills");&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Then, add a function to validate the steps:&lt;/p&gt;  &lt;p&gt;Here, we check if each required input has some value and if the email input has a valid input. Then, we set the isValid boolean accordingly. We also call a &lt;code&gt;showError()&lt;/code&gt; function, which we haven’t defined yet.&lt;/p&gt; &lt;p&gt;Paste this code above the &lt;code&gt;validateStep()&lt;/code&gt; function:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;function showError(input, message) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.add("error");
  errorSpan.textContent = message;
}&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Now, add the following styles to the stylesheet:&lt;/p&gt;  &lt;p&gt;If you refresh the form, you will see that the buttons do not take you to the next section till the inputs are considered valid:&lt;/p&gt;  &lt;p&gt;Finally, we want to add real-time error handling so that the errors go away when the user starts inputting the correct information. Add this function below the &lt;code&gt;validateStep()&lt;/code&gt; function:&lt;/p&gt;  &lt;p&gt;This function clears the errors if the input is no longer invalid by listening to input and change events then calling a function to clear the errors. Paste the &lt;code&gt;clearError()&lt;/code&gt; function below the &lt;code&gt;showError()&lt;/code&gt; one:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;function clearError(input) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.remove("error");
  errorSpan.textContent = "";
}&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;And now the errors clear when the user types in the correct value:&lt;/p&gt;  &lt;p&gt;The multi-step form now handles errors gracefully. If you do decide to keep the errors till the end of the form, then at the very least, jump the user back to the erroring form control and show some indication of how many errors they need to fix.&lt;/p&gt; &lt;p&gt;In a multi-step form, it is valuable to show the user a summary of all their answers at the end before they submit and to offer them an option to edit their answers if necessary. The person can’t see the previous steps without navigating backward, so showing a summary at the last step gives assurance and a chance to correct any mistakes.&lt;/p&gt; &lt;p&gt;Let’s add a fourth section to the markup to hold this summary view and move the submit button within it. Paste this just below the third section in &lt;code&gt;index.html&lt;/code&gt;:&lt;/p&gt;  &lt;p&gt;Then update the &lt;code&gt;formStep&lt;/code&gt; in your Javascript to read:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;const formSteps = ["one", "two", "three", "four"];&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Finally, add the following classes to &lt;code&gt;styles.css&lt;/code&gt;:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-css"&gt;&lt;code&gt;.summary-section {
  display: flex;
  align-items: center;
  gap: 10px;
}

.summary-section p:first-child {
  width: 30%;
  flex-shrink: 0;
  border-right: 1px solid var(--secondary-color);
}

.summary-section p:nth-child(2) {
  width: 45%;
  flex-shrink: 0;
  padding-left: 10px;
}

.edit-btn {
  width: 25%;
  margin-left: auto;
  background-color: transparent;
  color: var(--primary-color);
  border: .7px solid var(--primary-color);
  border-radius: 5px;
  padding: 5px;
}

.edit-btn:hover {
  border: 2px solid var(--primary-color);
  font-weight: bolder;
  background-color: transparent;
}
&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Now, add the following to the top of the &lt;code&gt;script.js&lt;/code&gt; file where the other &lt;code&gt;const&lt;/code&gt;s are:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;const nameVal = document.getElementById("name-val");
const idVal = document.getElementById("id-val");
const emailVal = document.getElementById("email-val");
const bdVal = document.getElementById("bd-val")
const cvVal = document.getElementById("cv-val");
const deptVal = document.getElementById("dept-val");
const skillsVal = document.getElementById("skills-val");
const editButtons = {
  "name-edit": 0,
  "id-edit": 0,
  "email-edit": 0,
  "bd-edit": 0,
  "cv-edit": 1,
  "dept-edit": 1,
  "skills-edit": 2
};&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Then add this function in &lt;code&gt;script.js&lt;/code&gt;:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;function updateSummaryValues() {
  nameVal.textContent = nameInput.value;
  idVal.textContent = idNumInput.value;
  emailVal.textContent = emailInput.value;
  bdVal.textContent = birthdateInput.value;

  const fileName = documentInput.files[0]?.name;
  if (fileName) {
    const extension = fileName.split(".").pop();
    const baseName = fileName.split(".")[0];
    const truncatedName = baseName.length &amp;gt; 10 ? baseName.substring(0, 10) + "..." : baseName;
    cvVal.textContent = `${truncatedName}.${extension}`;
  } else {
    cvVal.textContent = "No file selected";
  }

  deptVal.textContent = departmentInput.value;
  skillsVal.textContent = skillsInput.value || "No skills submitted";
}&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;This dynamically inserts the input values into the summary section of the form, truncates the file names, and offers a fallback text for the input that was not required.&lt;/p&gt; &lt;p&gt;Then update the &lt;code&gt;updateStepVisibility()&lt;/code&gt; function to call the new function:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;function updateStepVisibility() {
  formSteps.forEach((step) =&amp;gt; {
    document.getElementById(step).style.display = "none";
  });

  document.getElementById(formSteps[currentStep]).style.display = "block";
  stepInfo.textContent = `Step ${currentStep + 1} of ${formSteps.length}`;
  if (currentStep === 3) {
    updateSummaryValues();
  }

  navLeft.style.display = currentStep === 0 ? "none" : "block";
  navRight.style.display = currentStep === formSteps.length - 1 ? "none" : "block";
}&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Finally, add this to the &lt;code&gt;DOMContentLoaded&lt;/code&gt; event listener:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;Object.keys(editButtons).forEach((buttonId) =&amp;gt; {
  const button = document.getElementById(buttonId);
  button.addEventListener("click", (e) =&amp;gt; {
    currentStep = editButtons[buttonId];
    updateStepVisibility();
  });
});&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Running the form, you should see that the summary section shows all the inputted values and allows the user to edit any before submitting the information:&lt;/p&gt;  &lt;p&gt;And now, we can submit our form:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;form.addEventListener("submit", (e) =&amp;gt; {
  e.preventDefault();

  if (validateStep(2)) {
    alert("Form submitted successfully!");
    form.reset();
    currentStep = 0;
    updateStepVisibility();
  }
});&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Our multi-step form now allows the user to edit and see all the information they provide before submitting it.&lt;/p&gt; &lt;p&gt;Making multi-step forms accessible starts with the basics: &lt;strong&gt;using semantic HTML.&lt;/strong&gt; This is half the battle. It is closely followed by using appropriate form labels.&lt;/p&gt; &lt;p&gt;Other ways to make forms more accessible include giving enough room to elements that must be clicked on small screens and giving meaningful descriptions to the form navigation and progress indicators.&lt;/p&gt; &lt;p&gt;Offering feedback to the user is an important part of it; it’s not great to auto-dismiss user feedback after a certain amount of time but to allow the user to dismiss it themselves. Paying attention to contrast and font choice is important, too, as they both affect how readable your form is.&lt;/p&gt; &lt;p&gt;Let’s make the following adjustments to the markup for more technical accessibility:&lt;/p&gt; &lt;ol class="wp-block-list"&gt;
&lt;li&gt;&lt;strong&gt;Add &lt;code&gt;aria-required="true"&lt;/code&gt; to all inputs except the skills one.&lt;/strong&gt; This lets screen readers know the fields are required without relying on native validation.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;Add &lt;code&gt;role="alert"&lt;/code&gt; to the error spans.&lt;/strong&gt; This helps screen readers know to give it importance when the input is in an error state.&lt;/li&gt; &lt;li&gt;&lt;strong&gt;Add &lt;code&gt;role="status" aria-live="polite"&lt;/code&gt; to the &lt;code&gt;.stepInfo&lt;/code&gt;.&lt;/strong&gt; This will help screen readers understand that the step info keeps tabs on a state, and the aria-live being set to polite indicates that should the value change, it does not need to immediately announce it.&lt;/li&gt;
&lt;/ol&gt; &lt;p&gt;In the script file, replace the &lt;code&gt;showError()&lt;/code&gt; and &lt;code&gt;clearError()&lt;/code&gt; functions with the following:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;function showError(input, message) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.add("error");
  input.setAttribute("aria-invalid", "true");
  input.setAttribute("aria-describedby", errorSpan.id);
  errorSpan.textContent = message;
}

function clearError(input) {
  const formControl = input.parentElement;
  const errorSpan = formControl.querySelector(".error-message");
  input.classList.remove("error");
  input.removeAttribute("aria-invalid");
  input.removeAttribute("aria-describedby");
  errorSpan.textContent = "";
}&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;Here, we programmatically add and remove attributes that explicitly tie the input with its error span and show that it is in an invalid state.&lt;/p&gt; &lt;p&gt;Finally, let’s add focus on the first input of every section; add the following code to the end of the &lt;code&gt;updateStepVisibility()&lt;/code&gt; function:&lt;/p&gt; &lt;pre class="wp-block-csstricks-code-block language-javascript"&gt;&lt;code&gt;const currentStepElement = document.getElementById(formSteps[currentStep]);
const firstInput = currentStepElement.querySelector(
  "input, select, textarea"
);

if (firstInput) {
  firstInput.focus();
}&lt;/code&gt;&lt;/pre&gt; &lt;p&gt;And with that, the multi-step form is much more accessible.&lt;/p&gt; &lt;p&gt;There we go, a four-part multi-step form for a job application! As I said at the top of this article, there’s a lot to juggle — so much so that I wouldn’t fault you for looking for an out-of-the-box solution.&lt;/p&gt; &lt;p&gt;But if you have to hand-roll a multi-step form, hopefully now you see it’s not a death sentence. There’s a happy path that gets you there, complete with navigation and validation, without turning away from good, accessible practices.&lt;/p&gt; &lt;p&gt;And this is just how I approached it! Again, I took this on as a personal challenge to see how far I could get, and I’m pretty happy with it. But I’d love to know if you see additional opportunities to make this even more mindful of the user experience and considerate of accessibility.&lt;/p&gt; &lt;p&gt;Here are some relevant links I referred to when writing this article:&lt;/p&gt; &lt;/div&gt;</summary></entry><entry><title>seddonym/import-linter: Import Linter allows you to define and enforce rules for the internal and external imports within your Python project.</title><link href="https://github.com/seddonym/import-linter/?featured_on=talkpython" rel="alternate"></link><published>2024-12-15T09:20:29.452000Z</published><id>https://github.com/seddonym/import-linter/?featured_on=talkpython</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/seddonymimport-linte/0:68ceff"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

</summary></entry><entry><title>Publishing a simple client-side JavaScript package to npm with GitHub Actions</title><link href="https://til.simonwillison.net/npm/npm-publish-github-actions" rel="alternate"></link><published>2024-12-11T16:59:01.141000Z</published><author><name>Simon Willison</name></author><id>https://til.simonwillison.net/npm/npm-publish-github-actions</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/publishing-a-simple-/7901422:c70d48"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/7901422.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Simon Willison TIL.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;p&gt;Here's what I learned about publishing a single file JavaScript package to NPM for my &lt;a href="https://simonwillison.net/2024/Dec/7/prompts-js/" rel="nofollow"&gt;Prompts.js&lt;/a&gt; project.&lt;/p&gt;
&lt;p&gt;The code is in &lt;a href="https://github.com/simonw/prompts-js"&gt;simonw/prompts-js&lt;/a&gt; on GitHub. The NPM package is &lt;a href="https://www.npmjs.com/package/prompts-js" rel="nofollow"&gt;prompts-js&lt;/a&gt;.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;&lt;h2 class="heading-element"&gt;A simple single file client-side package&lt;/h2&gt;&lt;a class="anchor" href="https://til.simonwillison.net/tils/feed.atom#a-simple-single-file-client-side-package" id="user-content-a-simple-single-file-client-side-package"&gt;&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;p&gt;For this project, I wanted to create an old-fashioned JavaScript file that you could include in a web page using a &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tag. No TypeScript, no React JSX, no additional dependencies, no build step.&lt;/p&gt;
&lt;p&gt;I also wanted to ship it to NPM, mainly so it would be magically available from various CDNs.&lt;/p&gt;
&lt;p&gt;I think I've boiled that down to about as simple as I can get. Here's the &lt;code&gt;package.json&lt;/code&gt; file:&lt;/p&gt;
&lt;div class="highlight highlight-source-json"&gt;&lt;pre&gt;{
  &lt;span class="pl-ent"&gt;"name"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;prompts-js&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"version"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;0.0.4&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"description"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;async alternatives to browser alert() and prompt() and confirm()&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"main"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;index.js&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"homepage"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;https://github.com/simonw/prompts-js&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"scripts"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"test"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;echo &lt;span class="pl-cce"&gt;\"&lt;/span&gt;Error: no test specified&lt;span class="pl-cce"&gt;\"&lt;/span&gt; &amp;amp;&amp;amp; exit 1&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  },
  &lt;span class="pl-ent"&gt;"author"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Simon Willison&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"license"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;Apache-2.0&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
  &lt;span class="pl-ent"&gt;"repository"&lt;/span&gt;: {
    &lt;span class="pl-ent"&gt;"type"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;git&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-ent"&gt;"url"&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;git+https://github.com/simonw/prompts-js.git&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  },
  &lt;span class="pl-ent"&gt;"keywords"&lt;/span&gt;: [
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;alert&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;prompt&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;confirm&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;async&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;promise&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;dialog&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  ],
  &lt;span class="pl-ent"&gt;"files"&lt;/span&gt;: [
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;index.js&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;README.md&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;,
    &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;"&lt;/span&gt;LICENSE&lt;span class="pl-pds"&gt;"&lt;/span&gt;&lt;/span&gt;
  ]
}&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That "scripts.test" block probably isn't necessary. The &lt;code&gt;keywords&lt;/code&gt; are used when you deploy to NPM, and the &lt;code&gt;files&lt;/code&gt; block tells NPM which files to include in the package.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;"repository"&lt;/code&gt; block is used by NPM's &lt;a href="https://docs.npmjs.com/generating-provenance-statements" rel="nofollow"&gt;provenance statements&lt;/a&gt;. Don't worry too much about these - they're only needed if you use the &lt;code&gt;npm publish --provenance&lt;/code&gt; option later on.&lt;/p&gt;
&lt;p&gt;Really the three most important keys here are &lt;code&gt;"name"&lt;/code&gt;, which needs to be a unique name on NPM, &lt;code&gt;"version"&lt;/code&gt; and that &lt;code&gt;"main"&lt;/code&gt; key. I set &lt;code&gt;"main"&lt;/code&gt; to &lt;code&gt;index.js&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;All that's needed now is that &lt;code&gt;index.js&lt;/code&gt; file - and optionally the &lt;code&gt;README.md&lt;/code&gt; and &lt;code&gt;LICENSE&lt;/code&gt; files if we want to include them in the package. The &lt;code&gt;README.md&lt;/code&gt; ends up displayed on the NPM listing page so it's worth including.&lt;/p&gt;
&lt;p&gt;Here's my &lt;a href="https://github.com/simonw/prompts-js/blob/main/index.js"&gt;index.js&lt;/a&gt; file. It starts and ends like this (an &lt;a href="https://developer.mozilla.org/en-US/docs/Glossary/IIFE" rel="nofollow"&gt;IIFE&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight highlight-source-js"&gt;&lt;pre&gt;&lt;span class="pl-k"&gt;const&lt;/span&gt; &lt;span class="pl-v"&gt;Prompts&lt;/span&gt; &lt;span class="pl-c1"&gt;=&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-k"&gt;function&lt;/span&gt; &lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt;
  &lt;span class="pl-c"&gt;// ...&lt;/span&gt;
  &lt;span class="pl-k"&gt;return&lt;/span&gt; &lt;span class="pl-kos"&gt;{&lt;/span&gt; alert&lt;span class="pl-kos"&gt;,&lt;/span&gt; confirm&lt;span class="pl-kos"&gt;,&lt;/span&gt; prompt &lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;
&lt;span class="pl-kos"&gt;}&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;(&lt;/span&gt;&lt;span class="pl-kos"&gt;)&lt;/span&gt;&lt;span class="pl-kos"&gt;;&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;div class="markdown-heading"&gt;&lt;h2 class="heading-element"&gt;Publishing to NPM&lt;/h2&gt;&lt;a class="anchor" href="https://til.simonwillison.net/tils/feed.atom#publishing-to-npm" id="user-content-publishing-to-npm"&gt;&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;p&gt;With these pieces in place, running &lt;code&gt;npm publish&lt;/code&gt; in the root of the project will publish the package to NPM - after first asking you to sign into your NPM account.&lt;/p&gt;
&lt;div class="markdown-heading"&gt;&lt;h2 class="heading-element"&gt;Automating this with GitHub Actions&lt;/h2&gt;&lt;a class="anchor" href="https://til.simonwillison.net/tils/feed.atom#automating-this-with-github-actions" id="user-content-automating-this-with-github-actions"&gt;&lt;span class="octicon octicon-link"&gt;&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;
&lt;p&gt;I use GitHub Actions that trigger on any release to publish all of my Python projects to PyPI. I wanted to do the same for this JavaScript project.&lt;/p&gt;
&lt;p&gt;I found &lt;a href="https://docs.github.com/en/actions/use-cases-and-examples/publishing-packages/publishing-nodejs-packages#publishing-packages-to-the-npm-registry"&gt;this example&lt;/a&gt; in the GitHub documentation which gave me most of what I needed. This is in &lt;a href="https://github.com/simonw/prompts-js/blob/main/.github/workflows/publish.yml"&gt;.github/workflows/publish.yml&lt;/a&gt;:&lt;/p&gt;
&lt;div class="highlight highlight-source-yaml"&gt;&lt;pre&gt;&lt;span class="pl-ent"&gt;name&lt;/span&gt;: &lt;span class="pl-s"&gt;Publish Package to npmjs&lt;/span&gt;
&lt;span class="pl-ent"&gt;on&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;release&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;types&lt;/span&gt;: &lt;span class="pl-s"&gt;[published]&lt;/span&gt;
&lt;span class="pl-ent"&gt;jobs&lt;/span&gt;:
  &lt;span class="pl-ent"&gt;build&lt;/span&gt;:
    &lt;span class="pl-ent"&gt;runs-on&lt;/span&gt;: &lt;span class="pl-s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="pl-ent"&gt;permissions&lt;/span&gt;:
      &lt;span class="pl-ent"&gt;contents&lt;/span&gt;: &lt;span class="pl-s"&gt;read&lt;/span&gt;
      &lt;span class="pl-ent"&gt;id-token&lt;/span&gt;: &lt;span class="pl-s"&gt;write&lt;/span&gt;
    &lt;span class="pl-ent"&gt;steps&lt;/span&gt;:
      - &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/checkout@v4&lt;/span&gt;
      - &lt;span class="pl-ent"&gt;uses&lt;/span&gt;: &lt;span class="pl-s"&gt;actions/setup-node@v4&lt;/span&gt;
        &lt;span class="pl-ent"&gt;with&lt;/span&gt;:
          &lt;span class="pl-ent"&gt;node-version&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;20.x&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
          &lt;span class="pl-ent"&gt;registry-url&lt;/span&gt;: &lt;span class="pl-s"&gt;&lt;span class="pl-pds"&gt;'&lt;/span&gt;https://registry.npmjs.org&lt;span class="pl-pds"&gt;'&lt;/span&gt;&lt;/span&gt;
      - &lt;span class="pl-ent"&gt;run&lt;/span&gt;: &lt;span class="pl-s"&gt;npm publish --provenance --access public&lt;/span&gt;
        &lt;span class="pl-ent"&gt;env&lt;/span&gt;:
          &lt;span class="pl-ent"&gt;NODE_AUTH_TOKEN&lt;/span&gt;: &lt;span class="pl-s"&gt;${{ secrets.NPM_TOKEN }}&lt;/span&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;There's that &lt;code&gt;--provenance&lt;/code&gt; option which only works if you have the &lt;code&gt;repository&lt;/code&gt; block set up in your &lt;code&gt;package.json&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;This needs a secret called &lt;code&gt;NPM_TOKEN&lt;/code&gt; to be set up in the GitHub repository settings.&lt;/p&gt;
&lt;p&gt;It took me a few tries to get this right. It needs to be a token created on the NPM website using the Access Tokens menu item, then Generate New Token -&amp;gt; Classic Token. As far as I can tell the new "Granular Access Token" format doesn't work for this as it won't allow you to create a token that never expires, and I never want to have to remember to update the secret in the future.&lt;/p&gt;
&lt;p&gt;An "Automation" token should do the trick here - it bypasses 2-factor authentication when publishing.&lt;/p&gt;
&lt;p&gt;Set that in GitHub Actions as a secret called &lt;code&gt;NPM_TOKEN&lt;/code&gt; and now you can publish a new version of your package to NPM by doing the following:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Update the version number in &lt;code&gt;package.json&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Create a new release on GitHub with a tag that matches the version number&lt;/li&gt;
&lt;/ol&gt;</summary></entry><entry><title>Simple trick to save environment and money when using GitHub Actions</title><link href="https://turso.tech/blog/simple-trick-to-save-environment-and-money-when-using-github-actions" rel="alternate"></link><published>2024-12-11T15:55:31.348000Z</published><id>https://turso.tech/blog/simple-trick-to-save-environment-and-money-when-using-github-actions</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/simple-trick-to-save/0:c698b4"&gt;shared this story&lt;/a&gt;
            .&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="prose prose-invert prose-quoteless prose-a:text-aquamarine prose-lg max-w-none"&gt;&lt;p&gt;We recently onboarded &lt;a class="external" href="https://x.com/SivukhinN" rel="nofollow"&gt;Nikita Sivukhin&lt;/a&gt; as a new member of our Engineering team at &lt;a class="external" href="https://turso.tech" rel="nofollow"&gt;Turso&lt;/a&gt;. He immediately started to have meaningful contributions to our &lt;a class="external" href="https://turso.tech/vector" rel="nofollow"&gt;Native Vector Search&lt;/a&gt; but something else triggered me to write this article. In addition to working on his main task, Nikita started to poke around our codebase and to fix anything he found worth tackling. This is a great proactive approach which I highly recommend to any software engineer. One thing improved by Nikita was our GitHub Actions setup to avoid running jobs that are no longer needed. This is great because GitHub Actions not only consume electricity when they run but also either cost money when used for private repositories or have some usage quota for open source projects.&lt;/p&gt;
&lt;h2 class="relative"&gt;&lt;a class="opacity-70 hover:opacity-90 pr-2 font-semibold -ml-7" href="http://#what-s-the-problem" rel="nofollow"&gt;#&lt;/a&gt;&lt;a class="external" href="http://#whats-theproblem" rel="nofollow"&gt;&lt;span class="icon icon-link"&gt;&lt;/span&gt;&lt;/a&gt;What's the problem&lt;/h2&gt;
&lt;p&gt;We use GitHub Actions for our CI/CD at &lt;a class="external" href="https://turso.tech" rel="nofollow"&gt;Turso&lt;/a&gt;, both on open source projects and on private ones. Among other things, we run GitHub Actions on our Pull Requests. Some of those actions are pretty heavy and can take a considerable amount of time. Rust compilation has its share, but we also run all sorts of tests, from unit tests to end-to-end tests. It isn't uncommon for a Pull Request to be updated before CI/CD has finished for the previous version. Unfortunately, GitHub does not cancel GitHub Actions for a stale version of the code, and those tasks keep running until they either fail or fully finish. This is a problem because those old CI/CD runs consume resources like electricity and GitHub Action runners even though no one is interested in the outcome of the run any more.&lt;/p&gt;
&lt;h2 class="relative"&gt;&lt;a class="opacity-70 hover:opacity-90 pr-2 font-semibold -ml-7" href="http://#solution" rel="nofollow"&gt;#&lt;/a&gt;&lt;a class="external" href="http://#solution" rel="nofollow"&gt;&lt;span class="icon icon-link"&gt;&lt;/span&gt;&lt;/a&gt;Solution&lt;/h2&gt;
&lt;p&gt;This problem can be easily solved in a universal way. If you're running your GitHub Actions on the &lt;code&gt;pull_request:&lt;/code&gt; target, then you just need to add the following snippet to the definition of your GitHub workflow:&lt;/p&gt;
&lt;pre&gt;&lt;code class="hljs language-yaml"&gt;concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And voilà: GitHub will cancel any in-progress runs that become stale once a new version of the Pull Request is pushed. You can see the solution in wider context in &lt;a class="external" href="https://github.com/tursodatabase/libsql/pull/1540" rel="nofollow"&gt;Nikita's Pull Request&lt;/a&gt; that added this to the &lt;a class="external" href="https://turso.tech/libsql" rel="nofollow"&gt;LibSQL&lt;/a&gt; GitHub repository.&lt;/p&gt;
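&lt;p&gt;For context, here's a minimal sketch of where that block sits in a complete workflow file (the workflow name and the placeholder job are made up; only the &lt;code&gt;concurrency&lt;/code&gt; stanza is the part that matters):&lt;/p&gt;
&lt;pre&gt;&lt;code class="hljs language-yaml"&gt;name: CI

on:
  pull_request:

# Top-level concurrency: a new push to the same PR cancels the run still going for the old commit.
concurrency:
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
  cancel-in-progress: true

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "placeholder for the real build and test steps"
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Setting &lt;code&gt;concurrency&lt;/code&gt; at the top level applies it to the whole workflow; GitHub also accepts the same key at the job level if you only want some jobs cancelled.&lt;/p&gt;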
&lt;h2 class="relative"&gt;&lt;a class="opacity-70 hover:opacity-90 pr-2 font-semibold -ml-7" href="http://#effects" rel="nofollow"&gt;#&lt;/a&gt;&lt;a class="external" href="http://#effects" rel="nofollow"&gt;&lt;span class="icon icon-link"&gt;&lt;/span&gt;&lt;/a&gt;Effects&lt;/h2&gt;
&lt;p&gt;As a consequence of this change you will start seeing a new result type on your GitHub Actions summary page. There will be not only a green circle with a tick and a red circle with an X, but also a grey octagon with an exclamation point, which means a task was cancelled. Below is a screenshot from the GitHub Actions summary page of the &lt;a class="external" href="https://turso.tech/libsql" rel="nofollow"&gt;LibSQL&lt;/a&gt; repository.&lt;/p&gt;
&lt;p&gt;[Screenshot: GitHub Actions summary page showing passed, failed, and cancelled runs]&lt;/p&gt;
&lt;p&gt;During the first week after Nikita's Pull Request had been merged, 56 tasks were cancelled in &lt;a class="external" href="https://turso.tech/libsql" rel="nofollow"&gt;LibSQL&lt;/a&gt; repository alone.&lt;/p&gt;
&lt;h2 class="relative"&gt;&lt;a class="opacity-70 hover:opacity-90 pr-2 font-semibold -ml-7" href="http://#conclusion" rel="nofollow"&gt;#&lt;/a&gt;&lt;a class="external" href="http://#conclusion" rel="nofollow"&gt;&lt;span class="icon icon-link"&gt;&lt;/span&gt;&lt;/a&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;I hope that this short article was able to convince you that if you're using GitHub Actions for your CI/CD then you can easily become more environmentally friendly and possibly save some money on your GitHub bill.&lt;/p&gt;&lt;/div&gt;</summary></entry><entry><title>Brendan Gregg's Blog</title><link href="https://www.brendangregg.com/blog/2024-10-29/ai-flame-graphs.html" rel="alternate"></link><published>2024-12-11T15:45:32.342000Z</published><id>https://www.brendangregg.com/blog/2024-10-29/ai-flame-graphs.html</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/brendan-greggs-blog/5492585:438980"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/5492585.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Brendan Gregg&amp;#x27;s Blog.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div class="post"&gt;
&lt;p&gt;Imagine halving the resource costs of AI and what that could mean for the planet and the industry -- based on extreme estimates, such savings could reduce the total US power usage by over 10% by 2030&lt;sup&gt;1&lt;/sup&gt;. At Intel we've been creating a new analyzer tool to help reduce AI costs called &lt;em&gt;AI Flame Graphs&lt;/em&gt;: a visualization that shows an AI accelerator or GPU hardware profile along with the full software stack, based on my &lt;strong&gt;&lt;a class="external" href="https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html" rel="nofollow"&gt;CPU flame graphs&lt;/a&gt;&lt;/strong&gt;. Our first version is available to customers in the &lt;strong&gt;&lt;a class="external" href="https://www.intel.com/content/www/us/en/developer/tools/devcloud/services.html" rel="nofollow"&gt;Intel Tiber AI Cloud&lt;/a&gt;&lt;/strong&gt; as a preview for the Intel Data Center GPU Max Series (previously called Ponte Vecchio). Here is an example:&lt;/p&gt; &lt;p&gt;&lt;/p&gt;&lt;center&gt;&lt;a class="external" href="/blog/images/2024/matrixAIflamegraph.svg" rel="nofollow"&gt;&lt;img alt="" src="https://www.brendangregg.com/blog/images/2024/matrixAIflamegraph.png" width="700"/&gt;&lt;/a&gt;&lt;br/&gt;&lt;em&gt;Simple example: SYCL matrix multiply microbenchmark&lt;/em&gt;&lt;/center&gt; &lt;p&gt;(Click for interactive &lt;a class="external" href="/blog/images/2024/matrixAIflamegraph.svg" rel="nofollow"&gt;SVG&lt;/a&gt;.) The green frames are the actual instructions running on the AI or GPU accelerator, aqua shows the source code for these functions, and red (C), yellow (C++), and orange (kernel) show the CPU code paths that initiated these AI/GPU programs. The gray "-" frames just help highlight the boundary between CPU and AI/GPU code. The x-axis is proportional to cost, so you look for the widest things and find ways to reduce them.&lt;/p&gt; &lt;p&gt;&lt;/p&gt;&lt;center&gt;&lt;img alt="" src="https://www.brendangregg.com/blog/images/2024/AIflamegraph-legend.png" width="150"/&gt;&lt;br/&gt;&lt;em&gt;Layers&lt;/em&gt;&lt;/center&gt; &lt;p&gt;This flame graph shows a simple program for SYCL (a high-level C++ language for accelerators) that tests three implementations of matrix multiply, running them with the same input workload. The flame graph is dominated by the slowest implementation, multiply_basic(), which doesn't use any optimizations, consumes 72% of stall samples, and is shown as the widest tower. On the right are two thin towers for multiply_local_access() at 21%, which replaces the accessor with a local variable, and multiply_local_access_and_tiling() at 6%, which also adds matrix tiling. The towers are getting smaller as optimizations are added.&lt;/p&gt; &lt;p&gt;This flame graph profiler is a prototype based on Intel EU stall profiling for hardware profiling and &lt;a class="external" href="https://ebpf.io/" rel="nofollow"&gt;eBPF&lt;/a&gt; for software instrumentation. It's designed to be &lt;strong&gt;easy and low-overhead&lt;/strong&gt;, just like a CPU profiler. 
You should be able to generate a flame graph of an existing AI workload whenever you want, without having to restart anything or launch additional code via an interposer.&lt;/p&gt; &lt;h2&gt;Instruction-offset Profiling&lt;/h2&gt; &lt;p&gt;This is not the first project to build an AI profiler or even something called an AI Flame Graph; however, others I've seen focus on tracing CPU stacks and timing accelerator execution, but don't profile the instruction offsets running on the accelerator; or do profile them but via expensive binary instrumentation. I wanted to build AI flame graphs that work like CPU flame graphs: Easy to use, negligible cost, production safe, and shows everything. A daily tool for developers, with most of the visualization &lt;em&gt;in the language of the developer&lt;/em&gt;: source code functions.&lt;/p&gt; &lt;p&gt;This has been an internal AI project at Intel for the past year. Intel was already investing in this space, building the EU stall profiler capability for the Intel Data Center GPU Max Series that provides an approximation of HW instruction sampling. I was lucky to have &lt;strong&gt;Dr. Matthew (Ben) Olson&lt;/strong&gt;, an Intel AI engineer who has also worked on eBPF performance tooling (&lt;a class="external" href="https://github.com/intel/processwatch" rel="nofollow"&gt;processwatch&lt;/a&gt;) as well as memory management research, join my team and do most of the development work. His background has helped us power through difficulties that seemed insurmountable. We've also recently been joined by &lt;strong&gt;Dr. Brandon Kammerdiener&lt;/strong&gt; (coincidentally another graduate of the University of Tennessee, like Ben), who also has eBPF and memory internals experience, and has been helping us take on harder and harder workloads. And &lt;strong&gt;Gabriel Muñoz&lt;/strong&gt; just joined today to help with releases. Now that our small team has shown that this is possible, we'll be joined by other teams at Intel to develop this further.&lt;/p&gt; &lt;p&gt;We could have built a harder-to-use and higher-overhead version months ago using Intel GTPin, but for widespread adoption it needs minimal overhead and ease of use so that developers don't hesitate to use this daily and to add it to deployment pipelines.&lt;/p&gt; &lt;h2&gt;What's a Flame Graph?&lt;/h2&gt; &lt;p&gt;&lt;/p&gt;&lt;center&gt;&lt;img alt="" src="https://www.brendangregg.com/blog/images/2024/flamegraph-cost.png" width="300"/&gt;&lt;/center&gt; &lt;p&gt;A &lt;a class="external" href="https://www.brendangregg.com/flamegraphs.html" rel="nofollow"&gt;flame graph&lt;/a&gt; is a visualization I invented in 2011 for showing sampled code stack traces. It has become the standard for CPU profiling and analysis, helping developers quickly find performance improvements and eliminate regressions. A CPU flame graph shows the "big picture" of running software, with x-axis proportional to CPU cost. The example picture on the right summarizes how easy it can be to go from compute costs to responsible code paths. Prior to flame graphs, it could take hours to understand a complex profile by reading through &lt;a class="external" href="https://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#Problem" rel="nofollow"&gt;hundreds of pages of output&lt;/a&gt;. Now it takes seconds: all you have to do is look for the widest rectangles.&lt;/p&gt; &lt;p&gt;Flame graphs have had worldwide adoption. 
They have been the basis for five startups so far, have been adopted in over thirty performance analysis products, and have had &lt;a class="external" href="https://www.brendangregg.com/Slides/YOW2022_flame_graphs/#8" rel="nofollow"&gt;over eighty implementations&lt;/a&gt;.&lt;/p&gt; &lt;p&gt;My first implementation of flame graphs took a few hours on a Wednesday night after work. The real effort has been in the decade since, where I worked with different profilers, runtimes, libraries, kernels, compilers, and hypervisors to get flame graphs working properly in different environments, including fixing stack walking and symbolization. Earlier this year I posted about the final missing piece: Helping distros &lt;a class="external" href="/blog/2024-03-17/the-return-of-the-frame-pointers.html" rel="nofollow"&gt;enable frame pointers&lt;/a&gt; so that profiling works across standard system libraries.&lt;/p&gt; &lt;p&gt;Similar work is necessary for AI workloads: fixing stacks and symbols and getting profiling to work for different hardware, kernel drivers, user-mode drivers, frameworks, runtimes, languages, and models. A lot more work, too, as AI analysis has less maturity than CPU analysis.&lt;/p&gt; &lt;h2&gt;Searching Samples&lt;/h2&gt; &lt;p&gt;If you are new to flame graphs, it's worth mentioning the built-in search capability. In the earlier example, most of the stall samples are caused by sbid: software scoreboard dependency. As that may be a unique search term, you can run search (Ctrl-F, or click "Search") on "sbid" and it will highlight it in magenta:&lt;/p&gt; &lt;p&gt;&lt;/p&gt;&lt;center&gt;&lt;img alt="" src="https://www.brendangregg.com/blog/images/2024/AIflamegraph-search.png" width="530"/&gt;&lt;/center&gt; &lt;p&gt;Search also shows the total number of stack samples that contained sbid in the bottom right: 78.4%. You can search for any term in the flame graph: accelerator instructions, source paths, function names, etc., to quickly calculate the percentage of stacks where it is present (excluding vertical overlap), helping you prioritise performance work.&lt;/p&gt; &lt;p&gt;Note that the samples are EU stall-based, which means theoretical performance wins can take the percentages down to zero. This is different to the timer-based samples which are typically used in CPU profiling. Stalls mean you better focus on the pain, the parts of the code that aren't making forward progress, but you aren't seeing resource usage by unstalled instructions. I'd like to support timer-based samples in the future as well, so we can have both views.&lt;/p&gt; &lt;h2&gt;Who will use this?&lt;/h2&gt; &lt;p&gt;At a recent golang conference, I asked the audience of 200+ to raise their hands if they were using CPU flame graphs. Almost every hand went up. I know of companies where flame graphs are a daily tool that developers use to understand and tune their code, reducing compute costs. This will become a daily tool for AI developers.&lt;/p&gt; &lt;p&gt;My employer will use this as well for evaluation analysis, to find areas to tune to beat competitors, as well as to better understand workload performance to aid design.&lt;/p&gt; &lt;h2&gt;Why is AI profiling hard?&lt;/h2&gt; &lt;p&gt;Consider CPU instruction profiling: This is easy when the program and symbol table are both in the file system and in a standardized file format (such as ELF) as is the case with native compiled code (C). 
CPU profiling gets hard for JIT-compiled code, like Java, as instructions and symbols are dynamically generated and placed in main memory (the process heap) without following a universal standard. For such JITted code we use runtime-specific methods and agents to retrieve snapshots of the heap information, which is different for each runtime.&lt;/p&gt; &lt;p&gt;AI workloads also have different runtimes (and frameworks, languages, user-mode drivers, compilers, etc.), any of which can require special tinkering to get their CPU stacks and symbols to work. These CPU stacks are shown as the red, orange, and yellow frames in the AI Flame Graph. For some AI workloads it's easy to get these frames working; others (like PyTorch) are a lot more work. &lt;/p&gt; &lt;p&gt;&lt;/p&gt;&lt;center&gt;&lt;img alt="" src="https://www.brendangregg.com/blog/images/2024/AIsourcezoom.png" width="450"/&gt;&lt;/center&gt; &lt;p&gt;But the real challenge is instruction profiling of actual GPU and AI accelerator programs -- shown as the aqua and green frames -- and correctly associating them with the CPU stacks beneath them. Not only may these GPU and AI programs not exist in the file system, but they may not even exist in main memory! Even for running programs. Once execution begins, they may be deallocated from main memory and only exist in special accelerator memory, beyond the direct reach of OS profilers and debuggers. Or within reach, but only through a prohibitively high-overhead HW-specific debugger interface.&lt;/p&gt; &lt;p&gt;There's also no /proc representation for these programs either (I've been proposing building an equivalent) so there's no direct way to even tell what is running and what isn't, and all the other /proc details. Forget instruction profiling, even ps(1) and all the other process tools do not work.&lt;/p&gt; &lt;p&gt;It's been a mind-bending experience, revealing what gets taken for granted because it has existed in CPU land for decades: A process table. Process tools. Standard file formats. Programs that exist in the file system. Programs running from main memory. Debuggers. Profilers. Core dumping. Disassembling. Single stepping. Static and dynamic instrumentation. Etc. For GPUs and AI, this is all far less mature. It can make the work exciting at times, when you think something is impossible and then find or devise a way.&lt;/p&gt; &lt;p&gt;Fortunately we have a head start as some things do exist. Depending on the runtime and kernel driver, there are debug interfaces where you can list running accelerator programs and other statistics, as used by tools like intel_gpu_top(1). You can kill -9 a GPU workload using intel_gpu_abrt(1). Some interfaces can even generate basic ELF files for the running accelerator programs that you can try to load in a debugger like gdb(1). And there is support for GPU/AI program disassembly, if you can get your hands on the binary. It feels to me like GPU/AI debugging, OS style, is about two years old. Better than zero, but still early on, and lots more ahead of us. A decade, at least.&lt;/p&gt; &lt;h2&gt;What do AI developers think of this?&lt;/h2&gt; &lt;p&gt;We've shown AI Flame Graphs to other AI developers at Intel and a common reaction is to be a bit puzzled, wondering what to do with it. AI developers think about their bit of code, but with AI Flame Graphs they can now see the entire stack for the first time, including the HW, and many layers they don't usually think about or don't know about. 
It basically looks like a pile of gibberish, with their code only a small part of the flame graph.&lt;/p&gt; &lt;p&gt;&lt;/p&gt;&lt;center&gt;&lt;a class="external" href="https://www.brendangregg.com/Slides/YOW2022_flame_graphs/#8" rel="nofollow"&gt;&lt;img alt="" src="https://www.brendangregg.com/blog/images/2024/flamegraph-montage.png" width="190"/&gt;&lt;/a&gt;&lt;br/&gt;&lt;em&gt;CPU Flame Graph Implementations&lt;/em&gt;&lt;/center&gt; &lt;p&gt;This reaction is similar to people's first experiences with CPU flame graphs, which show parts of the system that developers and engineers typically don't work on, such as runtime internals, system libraries, and kernel internals. Flame graphs are great at highlighting the dozen or so functions that matter the most, so it becomes a problem of learning what those functions do across a few different code bases, which are typically open source. Understanding a dozen such functions can take a few hours or even a few days -- but if this leads to a 10% or 2x cost win, it is time well spent. And the next time the user looks at a flame graph, they start saying "I've seen that function before" and so on. You can get to the point where understanding the bulk of a CPU flame graph takes less than a minute: look for the widest tower, click to zoom, read the frames, done.&lt;/p&gt; &lt;p&gt;I'm encouraged by the success of CPU flame graphs, with over 80 implementations and countless real world case studies. Sometimes I'm browsing a performance issue I care about on github and hit page down and there's a CPU flame graph. They are everywhere.&lt;/p&gt; &lt;p&gt;I expect AI developers will also be able to understand AI Flame Graphs in less than a minute, but to start with people will be spending a day or more browsing code bases they didn't know were involved. Publishing case studies of found wins will also help people learn how to interpret them, and also help explain the value.&lt;/p&gt; &lt;h2&gt;What about PyTorch?&lt;/h2&gt; &lt;p&gt;Another common reaction we've had is that AI developers are using PyTorch, and initially we didn't support it as it meant walking Python stacks, which isn't trivial. But prior work has been done there (to support CPU profiling) and after a lot of tinkering we now have the first PyTorch AI Flame Graph:&lt;/p&gt; &lt;p&gt;&lt;/p&gt;&lt;center&gt;&lt;a class="external" href="/blog/images/2024/PyTorchFlamegraph.svg" rel="nofollow"&gt;&lt;img alt="" src="https://www.brendangregg.com/blog/images/2024/PyTorchFlamegraph.png" width="700"/&gt;&lt;/a&gt;&lt;br/&gt;&lt;em&gt;PyTorch frames in pink &lt;/em&gt;&lt;/center&gt; &lt;p&gt;(Click for interactive &lt;a class="external" href="/blog/images/2024/PyTorchFlamegraph.svg" rel="nofollow"&gt;SVG&lt;/a&gt;.) The PyTorch functions are at the bottom and are colored pink. This example runs oneDNN kernels that are JIT-generated and don't have a source path, so that layer just reads "jit". Getting all the other layers included was a real pain to get going, but an important milestone. We think if we can do PyTorch we can do anything.&lt;/p&gt; &lt;p&gt;In this flame graph, we show PyTorch running the Llama 2 7B model using the Intel Extensions for PyTorch (IPEX). This flame graph shows the origin of the GPU kernel execution all the way back to the Python source code shown in pink. 
Most samples are from a stack leading up to a gemm_kernel (matrix multiply) shown in aqua, which like the previous example has many stalls due to software scoreboarding.&lt;/p&gt; &lt;p&gt;There are two instructions (0xa30 and 0xa90) that combined are 27% of the entire profile. I expect someone will ask: Can't we just click on instructions and have it bring up a disassembly view with full source? Yes, that should be possible, but I can't answer how we're going to provide this yet. Another expected question I can't yet answer: Since there are now multiple products providing AI auto-tuning of CPU workloads using CPU flame graphs (including &lt;a class="external" href="https://granulate.io/" rel="nofollow"&gt;Intel Granulate&lt;/a&gt;), can't we have AI auto-tuning of &lt;em&gt;AI&lt;/em&gt; workloads using AI Flame Graphs?&lt;/p&gt; &lt;h2&gt;First Release: Sometimes hard and with moderate overhead&lt;/h2&gt; &lt;p&gt;Getting AI Flame Graphs to work with some workloads is easy, but others are currently hard and cost moderate overhead. It's similar to CPU profiling, where some workloads and languages are easy to profile, whereas others need various things fixed. Some AI workloads use many software dependencies that need various tweaks and recompilation (e.g., enabling frame pointers so that stack walking works), making setup time-consuming. PyTorch is especially difficult and can take over a week of OS work to be ready for AI Flame Graphs. We will work on getting these tweaks changed upstream in their respective repositories, something that involves teams inside and outside of Intel and is a process I'd expect to take at least a year. During that time AI workloads will gradually become easier to flame graph, and with lower overhead as well.&lt;/p&gt; &lt;p&gt;I'm reminded of eBPF in the early days: You had to patch and recompile the kernel and LLVM and Clang, which could take multiple days if you hit errors. Since then all the eBPF dependency patches have been merged, and default settings changed, so that eBPF "just works." We'll get there with AI Flame Graphs too, but right now it's still those early days.&lt;/p&gt; &lt;p&gt;The changes necessary for AI Flame Graphs are really about improving debugging in general, and are a requirement for &lt;a class="external" href="https://www.brendangregg.com/Slides/eBPFSummit2023_FastByFriday/" rel="nofollow"&gt;Fast by Friday&lt;/a&gt;: A vision where we can root-cause analyze anything in five days or less.&lt;/p&gt; &lt;h2&gt;Availability&lt;/h2&gt; &lt;p&gt;AI Flame Graphs will first become available on the &lt;a class="external" href="https://www.intel.com/content/www/us/en/developer/tools/devcloud/services.html" rel="nofollow"&gt;Intel Tiber AI Cloud&lt;/a&gt; as a preview feature for the Intel Data Center GPU Max Series. If you are currently deployed there you can ask through the Intel service channel for early access. As for if or when it will support other hardware types, be in other Intel products, be officially launched, be open source, etc., these involve various other teams at Intel and they need to make their own announcements before I can discuss them here.&lt;/p&gt; &lt;h2&gt;Conclusions&lt;/h2&gt; &lt;p&gt;Finding performance improvements for AI data centers of just fractions of a percent can add up to planetary savings in electricity, water, and money. If AI flame graphs have the success that CPU flame graphs have had, I'd expect finding improvements of over 10% will be common, and 50% and higher will eventually be found*. 
But it won't be easy in these early days as there are still many software components to tweak and recompile, and software layers to learn about that are revealed in the AI flame graph.&lt;/p&gt; &lt;p&gt;In the years ahead I imagine others will build their own AI flame graphs that look the same as this one, and there may even be startups selling them, but if they use more difficult-to-use and higher-overhead technologies I fear they could turn companies off the idea of AI flame graphs altogether and prevent them from finding sorely needed wins. This is too important to do badly. AI flame graphs should be easy to use, cost negligible overhead, be production safe, and show everything. Intel has proven it's possible.&lt;/p&gt; &lt;h2&gt;Disclaimer&lt;/h2&gt; &lt;p&gt;
* This is a personal blog post that makes personal predictions but not guarantees of possible performance improvements. Feel free to take any claim with a grain of salt, and feel free to wait for an official publication and public launch by Intel on this technology.&lt;/p&gt; &lt;p&gt;&lt;sup&gt;1&lt;/sup&gt; Based on halving the Arm CEO Rene Haas' estimate of 20-25% quoted in &lt;a class="external" href="https://arstechnica.com/ai/2024/06/is-generative-ai-really-going-to-wreak-havoc-on-the-power-grid/" rel="nofollow"&gt;Taking a closer look at AI's supposed energy apocalypse&lt;/a&gt; by Kyle Orland of ArsTechnica.
&lt;/p&gt; &lt;h2&gt;Thanks&lt;/h2&gt; &lt;p&gt;&lt;em&gt;Thanks to everyone at Intel who has helped us make this happen. Markus Flierl has driven this project and made it a top priority, and Greg Lavender has expressed his support. Special thanks to Michael Cole, Matthew Roper, Luis Strano, Rodrigo Vivi, Joonas Lahtinen, Stanley Gambarin, Timothy Bauer, Brandon Yates, Maria Kraynyuk, Denis Samoylov, Krzysztof Raszknowski, Sanchit Jain, Po-Yu Chen, Felix Degrood, Piotr Rozenfeld, Andi Kleen, and all of the other coworkers that helped clear things up for us, and thanks in advance to everyone else who will be helping us in the months ahead.&lt;/em&gt;&lt;/p&gt; &lt;p&gt;My final thanks is to the companies and developers who do the actual hands-on work with flame graphs, collecting them, examining them, finding performance wins, and applying them.&lt;br/&gt;You are helping save the planet.&lt;/p&gt; &lt;/div&gt;</summary></entry><entry><title>BadRAM: Historic side channel undermines Confidential Computing in the cloud</title><link href="https://www.heise.de/news/BadRAM-Historischer-Seitenkanal-hebelt-Confidential-Computing-in-der-Cloud-aus-10193941.html" rel="alternate"></link><published>2024-12-11T07:08:46.427000Z</published><author><name>nomail@bock.nu (No Author)</name></author><id>https://www.heise.de/news/BadRAM-Historischer-Seitenkanal-hebelt-Confidential-Computing-in-der-Cloud-aus-10193941.html</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/badram-historischer-/8848229:5cacb1"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/8848229.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Heise Online.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

</summary><category term="c_t_magazin"></category></entry><entry><title>Split tunneling using Wireguard and namespaces - Thea Flowers</title><link href="https://blog.thea.codes/nordvpn-wireguard-namespaces/" rel="alternate"></link><published>2024-12-10T21:50:25.200000Z</published><id>https://blog.thea.codes/nordvpn-wireguard-namespaces/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/split-tunneling-usin/7297635:e43932"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://www.newsblur.com/rss_feeds/icon/7297635" style="vertical-align: middle;width:16px;height:16px;"&gt; blog.thea.codes.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

</summary></entry><entry><title>Lazy self-installing Python scripts with uv</title><link href="https://treyhunner.com/2024/12/lazy-self-installing-python-scripts-with-uv/" rel="alternate"></link><published>2024-12-10T21:45:27.010000Z</published><id>https://treyhunner.com/2024/12/lazy-self-installing-python-scripts-with-uv/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/lazy-self-installing/4690472:4b9182"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/4690472.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Trey Hunner.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

</summary></entry><entry><title>Ubiquitous Successful Bus: Hacking USB 2 Hubs</title><link href="https://hackaday.com/2024/11/05/ubiquitous-successful-bus-hacking-usb-2-hubs/" rel="alternate"></link><published>2024-11-12T16:08:19.673000Z</published><author><name>Arya Voronova</name></author><id>https://hackaday.com/2024/11/05/ubiquitous-successful-bus-hacking-usb-2-hubs/</id><summary type="html">&lt;table style="border: 1px solid #E0E0E0; margin: 0; padding: 0; background-color: #F0F0F0" valign="top" align="left" cellpadding="0" width="100%"&gt;
    &lt;tr&gt;
        &lt;td rowspan="2" style="padding: 6px;width: 36px;white-space:nowrap" width="36" valign="top"&gt;&lt;img src="https://www.gravatar.com/avatar/c7974846cecc4d764f6e3bfe203a0954" style="width: 36px; height: 36px; border-radius: 4px;"&gt;&lt;/td&gt;
        &lt;td width="100%" style="padding-top: 6px;"&gt;
            &lt;b&gt;
                bernhardbock 
                &lt;a href="https://bernhardbock.newsblur.com/story/ubiquitous-successfu/6031118:3ee64b"&gt;shared this story&lt;/a&gt;
            from &lt;img src="https://s3.amazonaws.com/icons.newsblur.com/6031118.png" style="vertical-align: middle;width:16px;height:16px;"&gt; Blog – Hackaday.&lt;/b&gt;
        &lt;/td&gt;
    &lt;/tr&gt;
    
&lt;/table&gt;

&lt;hr style="clear: both; margin: 0 0 24px;"&gt;

&lt;div&gt;&lt;img alt="" class="attachment-large size-large wp-post-image" height="450" src="https://hackaday.com/wp-content/uploads/2024/04/usb-featured.jpg?w=800" style="margin: 0 auto; margin-bottom: 15px;" tabindex="0" width="800" /&gt;&lt;/div&gt;&lt;p&gt;&lt;a href="https://hackaday.com/2024/10/17/ubiquitous-successful-bus-version-2/"&gt;We&amp;#8217;ve been recently looking into USB 2.0&lt;/a&gt; &amp;#8211; the ubiquitous point-to-point communications standard. USB 2 is completely different from USB 3, the blue-connector next-generation USB standard. For instance, USB 2 is a half-duplex pseudo-differential bus, and it&amp;#8217;s not AC-coupled. This makes USB2 notoriously difficult to galvanically isolate, as opposed to USB 3. On the other hand, USB 2 is a lot easier to incorporate into your projects. And perhaps the best way to do so is to implement a USB hub.&lt;/p&gt;
&lt;p&gt;USB 2 hubs are, by now, omnipresent. It doesn&amp;#8217;t cost much to add one to your board, and you truly have tons of options. The standard option is 4-port hubs &amp;#8211; one uplink port to your host, four downlink ports to your devices. If you only have two or three devices, you might be tempted to look for a hub IC with a lower number of ports, but it&amp;#8217;s not worth bothering &amp;#8211; just use a 4-port chip, and stock up on them.&lt;/p&gt;
&lt;p&gt;What about 7-port chips? You will see those every now and then &amp;#8211; but take a close look at the datasheet. Some of them will be two 4-port chips inside a single package, with four of the ports bottlenecked compared to the three other ports &amp;#8211; watch out! Desktop 7-port hubs are basically guaranteed to use two 4-port ICs, too, so, again, watch out for bottlenecks. &lt;code&gt;lsusb -t&lt;/code&gt; will help you determine the hub&amp;#8217;s structure in case you don&amp;#8217;t want to crack its case open, thankfully.&lt;/p&gt;
&lt;p&gt;Recommendations? I use SL2.1 chips &amp;#8211; they&amp;#8217;re available in an SO16 package, have a very unproblematic, to-the-point pinout, and are easily hand-solderable. CH334 is a close contender, but watch out because there are different variants of this chip that differ by both package and pinout, so if you&amp;#8217;re buying a chip with a certain letter, you will want to stick to it. Not just that, be careful &amp;#8211; different variants run out at different rates, so if you lock yourself into a CH334 variant, consider stocking up on it.&lt;span id="more-725468"&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;There&amp;#8217;s no shortage of Western-origin chips, either &amp;#8211; Texas Instruments is a leader here no doubt. If you ever fear running out of hub ICs in your stock while assembling something, you can prepare for this in advance by leaving zero-ohm footprints under the hub&amp;#8217;s package. USB 2 doesn&amp;#8217;t care for stubs much, and such a hack is very easy to do with SL2.1 in particular. Got two extra ports left over? Put them on a PC-case style dual USB2 9-pin header &amp;#8211; there&amp;#8217;s never a shortage of fun accessories compatible with it!&lt;/p&gt;
&lt;p&gt;Powering USB2 hub ICs is easy &amp;#8211; they tend to include a 5 V to 3.3 V linear regulator inside, so you can power them from a 5 V source directly. On the other hand, if you don&amp;#8217;t have any 5 V to spare, the overwhelming majority of hub ICs can be powered from 3.3 V directly &amp;#8211; usually, that requires shorting the hub&amp;#8217;s 5 V input to 3.3 V, but not necessarily. If the datasheet is unclear on 3.3 V-only operation, leave in some 0R jumpers. And, of course, make sure to add 100 nF or similar capacitors &amp;#8211; one per hub IC&amp;#8217;s power pin. Remember the disclaimer about built-in RC oscillators in MCUs being imprecise? Same goes for hubs &amp;#8211; if your hub boasts an internal RC oscillator, don&amp;#8217;t trust it, make sure you have a crystal footprint you can populate if you get stability issues.&lt;/p&gt;
&lt;p&gt;Putting some USB port pins to the outside world? You will want to protect them from harm &amp;#8211; or, rather, you will want to protect your expensive CPU from harm.&lt;/p&gt;
&lt;h2&gt;Please, Consider ESD Diodes&lt;/h2&gt;
&lt;figure class="wp-caption alignright" id="attachment_705446" style="width: 400px;"&gt;&lt;img alt="" class="wp-image-705446 size-medium" height="310" src="https://hackaday.com/wp-content/uploads/2024/08/hadimg_usb2_2.png?w=400" tabindex="0" width="400" /&gt;&lt;figcaption class="wp-caption-text" id="caption-attachment-705446"&gt;The black SOT23-6 footprint is a group of ESD diodes &amp;#8211; small, cheap, and it&amp;#8217;s easy to add in case you ever need it, which you very well might.&lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Bringing USB somewhere far, or even just using it as your link to the external world? You should really use ESD diodes &amp;#8211; or at least plan them in and give yourself the option to populate them later. There&amp;#8217;s no shortage of USB2-capable ESD diodes, after all, and ESD problems are closer than you might expect.&lt;/p&gt;
&lt;p&gt;For instance, I&amp;#8217;ve recently built a pocket device consisting of a battery-powered Pi Zero and a USB soundcard connected to wired headphones, with a pretty standard kind of long cable. I wear a lot of synthetic clothes, in particular, hoodies and jackets, and I kept having the Pi reboot every time I took my jacket off or put it on, through static electricity induced into the headphone wires through the cable insulation, going into the USB port on the Pi Zero.&lt;/p&gt;
&lt;p&gt;So, I went and put ESD diodes on the USB 2 pins, using the footprint I previously added to my board &amp;#8220;just in case&amp;#8221; but didn&amp;#8217;t populate, and this failure mode has instantly disappeared for good. Remember, footprints are free, and bodges cost time. Want a recommendation? The four-channel diodes are pretty good for USB 2; look for the SRV-05 footprint in KiCad, in the SOT-23-6 package. It&amp;#8217;s a generic enough footprint that there&amp;#8217;s no shortage of ESD diode packs available in it; they&amp;#8217;re low-capacitance enough that you can even use them for purposes like captouch pad protection, and they will also work for applications like Ethernet or externally available GPIOs.&lt;/p&gt;
&lt;p&gt;Do you need ESD diodes? Yes, just add the footprint. Same goes for over-current control switches, by the way &amp;#8211; I&amp;#8217;ve already talked about the SY6820, but it bears repeating. Your entire system doesn&amp;#8217;t have to reboot when you short-circuit a USB port on the board, and a cheap current-limited switch IC will let you ensure that&amp;#8217;s the case, while also letting you switch the port power on and off, as a nice bonus.&lt;/p&gt;
&lt;p&gt;This was just a few tips on and around USB 2 hubs and connectors, but I hope it helps you out with your projects.&lt;/p&gt;</summary><category term="hackaday columns"></category><category term="hardware"></category><category term="usb"></category><category term="usb 2"></category><category term="usb hub"></category></entry></feed>