2025 Year in Review: AI Explorations
It is December 2025, and I find myself reflecting on a very eventful year - a year of prolific building, exploration, reflection and scaling up, both at work and in the technical and intellectual side projects that touch AI. We have all been bombarded by AI innovations from the frontier labs, and I am no different. This has been a year of plenty in that sense: a year of discovering how to work with AI tools to build, explore, and scale the things we want to do. The new AI projects, both at work and in open source on the personal front, have been plentiful.
This year I made 587 GitHub contributions across 18 new repositories, and published two frameworks - Praval and Vajra Search - on PyPI. Professionally, this year saw tens of new initiatives, and for the past few months I have been in an expanded role that touches the strategy of how we use and build AI features, how we build with AI, and more. The contributions above do not include what I do at work, where I lead a team building a SaaS platform for HR.
Looking back on 2025, I find a tension running through everything I built, everything I abandoned, and everything I learned. This tension was born out of a recognition that the field of AI, and many other fields, stand forever changed as of 2025. Professionals like me are using AI tools to become more productive and expand our execution capabilities, even as we are aware of our growing dependency on those tools. This is a tension I have discussed elsewhere: needing to use the tools to benefit from them, while not wanting to abandon the craft of programming and engineering.
In this post, I will take you through what it was like to build, explore and do far more than I could have done individually, with the help of AI, and what I learned from these experiences. I will also walk through Praval, Vajra, and other things I worked on.
2025 at Work - A Short Summary
At work, I brought both greater innovation and greater professional growth to my team. We built a credible capability in building AI agents end-to-end, and have been leveraging this for numerous projects. I believe that the older LLM-powered workflows will soon be replaced by AI agents for a number of use cases. The AI teams I've worked with - my own and the team I collaborate with - have been incredible.
From a people standpoint, finding the right talent brings me a lot of satisfaction these days, and my team has coalesced into an effective unit in the last eight months. In mid-2025, I was promoted to a broader role, which vindicated some of the calls I have had to take with respect to career, team, company and more. Looking back, I'd say deep involvement and an innovation mindset have been strong contributors to my own growth. Even with a remote team, synergies have been possible, thanks to the leadership and the horizon-scanning we all do in AI leadership.
The innovation capacity of this team, and a forward-thinking leadership and strategy team, keep me interested - these are the nuts and bolts of how I have been able to deliver through others effectively at work. It takes a new breed of leader to navigate the changing landscape of new models, new benchmarks and new AI tools, while handling the interpersonal and team dynamics of a group with high-end talent that is shipping AI products. I am grateful to be working with such leaders. There has never been a more VUCA environment for me than 2023, but 2025 was close. In a sense, this was a good kind of uncertainty, driven by innovation rather than Covid or post-Covid risk. That is perhaps why this year was a great test of our ability to stay relevant as a team, and of mine personally, as a leader and innovator in the AI space.
My team is trying to stay relevant in the fast-changing software engineering space. As I lead a team building software for a more traditional industry that is being yanked into the world of AI-powered SaaS, we had to learn to build and ship fast and at scale. Even with processes, constraints, architecture support and plenty of resources, it takes strong behavioural changes to actually ship things fast and at high quality. For me, this has meant digging into every system at work: our clusters, our repos across products and how we run them, how the team tests and ships things, how we write specifications, and so on. One of the big enablers here has been AI. I have not seen a single user story in the last three months at work without objectives, acceptance criteria and the like written out in it. I have rarely seen a PR of late without a good conventional-commit-style description. This took a while, and the tooling has, I feel, finally caught up with the ambitions I have had for our standards of work. While we had worked in two-week sprints for several years, of late we tried a three-week development sprint with a testing mini-sprint bolted on, to test and ship each month. A subsequent SDLC realignment helped us return to the faster two-week sprint cadence.
I've honestly grown quite fond of the team I'm leading at work over the last year. The innovation and speed they've shown is inspiring, especially on AI agents and the process of building full end-to-end features with them. My team not only understands the fast-changing landscape of tools, technologies and capabilities, but is able to scale up to new challenges and innovate really fast. This has made all the difference to our effectiveness over the last year.
Coding Assistants and AI Powered Tools in 2025 and Looking Ahead
All through the year, my team's uptake of Cursor and agentic IDEs has been a significant phenomenon. This inspired me to use Claude Code, to which I have effectively switched for all my personal and open source projects. From rules and configurations that make Cursor effective for us, to subagents, extensive process-level checks, and many more innovations, the results of AI-powered software engineering have been commendable. A big realization for me personally has been that there's no going back. Software will be written by AI agents, and we will be building AI agents for different use cases in future. In this sense the "AI is eating software" narrative continues from 2025 into 2026, since the innovation in that space does not stop. There are caveats, just as with any technology - AI slop is one of the defining negative trends of 2025. While this is usually discussed in the context of the enshittification of the internet and its content, there is something to be said for the same in code. Vibe-coded applications went mainstream in 2025, and at work too there was a tension around the use of coding assistants. My best engineers leverage AI tools for what they are good at. Although the puck keeps moving on this one as models and agents get better, it helps to be prepared to write things the way humans wrote them for years before the advent of AI agents and coding assistants.
I learnt other lessons that counselled caution as well. Chief among them is the architectural and first-principles thinking that's crucial to us, and as I say to my team, "the need to be the pilot even if you have an AI copilot". This is crucial. The direction and vision for any product or system we build, with AI or without, has to be owned by humans, not by AI. Many friends feel that GPT-5.2 and Claude Opus 4.5 are two-of-a-kind models as of December 2025, and we don't know if this represents a form of artificial super intelligence. (Note that I didn't say artificial general intelligence, as I think that is a pipe dream.) I would not entirely disagree with them, because I think scaling laws have gotten to a point where complex internal representations in deep learning models can yield true conceptual understanding. This all reminds me of the Grokking paper, which describes how neural networks actually learn. Why is this relevant? Well, if an LLM is an ideation partner, where do we draw the line in terms of what is directed by human intelligence and what is directed by AI? Where do we draw the line in terms of the genesis or the validation of ideas? I am reminded of a talk by Qodo founder Itamar Friedman here - his pivot from building coding assistants to building code quality tools for AI coding assistants is an interesting one. The standards we need can only be defined by humans, not by AI. And this is possible for humans because the biggest context engineering pipeline is the physical layer of reality. That might sound cheeky, but it is perhaps the truth. This crucial mental model has to be internalized if AI professionals are to stay relevant in 2026 and beyond.
Another key lesson from 2025 is that ideas are extremely cheap, and execution is becoming cheaper. The GLM-4.x models from Z.ai have become both inexpensive to run and a worthy code generation competitor to Claude Code with the Claude Opus 4.5 model. This has happened just in the past few days, and I expect this momentum to carry into 2026. Kimi K2 Thinking and DeepSeek v3.2 represent SOTA models available at a fraction of the cost of the big $200/month subscriptions from Claude Code or OpenAI. This makes them more than suitable for content and code generation at scale and low cost. I expect that in 2026, tokens will become too cheap to meter for many SOTA models. The ChatGPT Go subscription launched in India is an example of a big lab taking advantage of this phenomenon, and we are likely to see others follow suit. This race to the bottom will make advanced capabilities table stakes more often in 2026, and have a further impact on the use of these tools for work, coding and so on. As of this writing in December 2025, AI agents are still unable to build, test and deploy entire software applications at scale with one prompt or one click. I think this is set to change in the coming year or two. OpenAI's founder alluded to "on-demand software" as a trend during his talk at the GPT-5 launch. I personally think this is a secular trend, much as small language models have been in 2025. All of this will help coding assistants become less expensive, faster and higher quality in 2026.
Personal Projects and AI Explorations: What I Built in 2025
Building agentic AI applications in 2025 was a natural extension of the LLM-based applications I built in 2024. While Ollama and local LMs were the flavour of last year, the falling costs of AI APIs meant that I did a lot more experimentation in 2025 with OpenAI. This year also saw me embrace Claude Code and terminal UIs, and backtrack a little on using Cursor for personal coding projects and open source projects. Cursor remains a competent agentic IDE, and especially after Cursor's Composer model, seems to have rejuvenated developer interest in the tool. Cursor helped me build a number of applications earlier in the year, but the bulk of what I've built in the latter part of the year has been with Claude.
Architectural thinking is one of the biggest areas where I have gained from these experiences. As someone who came into 2025 without a lot of experience building full stack applications, this year has been transformative. Having spent the initial portion of the year at work building on top of agentic AI and MCP capabilities, I found myself pivoting to my own tooling in the latter part of the year for these tasks. I wrote a few MCP servers and my own agentic AI framework, both of which were exciting developments.
As the year progressed, I found myself spending a lot more time exploring frontiers I had not touched before. A good example of this is Tlön mathematics; another is category theory based search, viz. Vajra Search. I also dipped into mechanistic interpretability of deep learning and transformer models later in the year. I will discuss these below. Behind this front of three or four meaty, substantial open source projects, there were at least a hundred experiments on various topics.
Below, I cover each of these projects in a little bit of detail. On this site, there are other posts that describe numerous challenges, issues and obstacles I have overcome in the year, and how I did so. Here, though, there is a short summary in each case.
Praval
The personal project I am most excited about this year is a multi-agent AI framework where agents collaborate like coral polyps forming a reef. It is now open source, on PyPI and GitHub, and eagerly seeking contributors to build on top of it. The name I chose for this framework is Praval, the Sanskrit word for coral (प्रवाल), and the metaphor runs deep: just as coral reefs emerge from simple organisms coordinating without central control, Praval agents broadcast knowledge through "spores" and respond to what they find relevant, with no manager directing traffic.
The framework reached version 0.7.20 this year, published on PyPI, with a decorator-based API that feels natural to Python developers:
@agent("researcher", responds_to=["query"])
def researcher(spore):
findings = chat(f"Research: {spore.knowledge['topic']}")
broadcast({"type": "analysis_request", "data": findings})
Praval also provides a similar decorator for tools (@tool), native memory capabilities using Chroma DB, and support for Qdrant. Agent-to-agent communication is a native feature of Praval, as is knowledge passing between agents using spores. Check out the framework at the official website.
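To give a feel for how a tool might sit alongside an agent, here is a minimal sketch. The @tool decorator itself exists in Praval, but the signature, arguments and registration shown below are illustrative assumptions rather than the published API.

```python
# Hypothetical sketch only: @tool is a real Praval feature, but this exact
# signature and usage are assumptions, not the framework's documented API.
@tool("arxiv_lookup")
def arxiv_lookup(query: str) -> str:
    """Return a short summary of papers matching the query (stubbed here)."""
    return f"Top ArXiv results for: {query}"

@agent("summarizer", responds_to=["analysis_request"])
def summarizer(spore):
    papers = arxiv_lookup(spore.knowledge["data"])
    broadcast({"type": "summary", "data": chat(f"Summarize: {papers}")})
```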
What excites me about Praval isn't just the code; it's what it represents. Most agent frameworks assume a hierarchical structure, with orchestrators directing workers. Praval bets on emergence, on the idea that intelligence can arise from peer interactions without anyone being in charge. Whether that bet pays off in production systems remains to be seen, but the exploration has been valuable regardless.
I also found out where Praval does not work well - Praval Code being one such project. Code generation agents like Claude Code, which I love using for all my work, are not a good fit for Praval. The core reason is that code generation seems to favour hierarchical, sub-agent patterns, unlike Praval, which is set up for large-scale multi-agent collaboration without a central orchestrator. I tried building Praval Code, unsuccessfully. That said, I expect this to be a fruitful area of exploration in the future.
The ecosystem for Praval has Praval Deep Research as a showcase project. This is a cool local-first app for researchers to find papers on ArXiv and chat with their findings. Apart from other experimental projects I've been exploring with friends, such as Praval Analytics, I've also been sketching out Praval Medha, a conceptual system where agents spawn other agents based on problem requirements; agents that architect agents. This is a project I am very excited about, and I expect to be able to build on it in the coming year.
What is the end-game with Praval? I was asked this by many friends and mentors. Making Praval open source has allowed it to become a community project, and I expect the community around it to keep growing. Thanks to incredible collaborators like Bargava and mentors like CM, I have been able to sink real time into this framework. What I look forward to with Praval:
- Contributors who I can work with to add many new features for Praval
- Users who can build on top of Praval to create new and innovative applications
- People who can help me build a community around Praval, and educate and engage with the developers who use it
GitHub: github.com/aiexplorations/praval | PyPI: pypi.org/project/praval
Related Project: github.com/aiexplorations/praval_deep_research
Vajra BM25
If Praval was about agentic AI and multi-agent collaboration, Vajra BM25 was about mathematical foundations and high-performance search - a combination of clean, composable search abstractions and low-latency engineering. The name "Vajra" comes from Sanskrit (वज्र, "thunderbolt"), and the project began as an experiment: could category theory abstractions make a search engine's code cleaner? The inspiration came from Bartosz Milewski's lectures on category theory, combined with work on Elasticsearch at my day job.
With category theory, I was able to frame BM25 search using coalgebras (state → possible next states) and morphisms (composable transformations). Search becomes "coalgebraic unfolding" where a query state unfolds into ranked results. The codebase has a categorical/ module with Morphism, Functor, and Coalgebra base classes that derived implementations extend.
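To make the idea concrete, here is a minimal sketch of what coalgebraic search abstractions can look like. The class names mirror the ones mentioned above, but the shapes and methods are illustrative assumptions on my part, not Vajra's actual categorical/ module.

```python
# Illustrative sketch of coalgebraic search abstractions; not Vajra's code.
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, TypeVar

S = TypeVar("S")  # search state (e.g. a normalized query plus candidates)
R = TypeVar("R")  # a ranked result

@dataclass
class Morphism(Generic[S]):
    """A composable transformation between search states."""
    fn: Callable[[S], S]

    def __rshift__(self, other: "Morphism[S]") -> "Morphism[S]":
        # Compose left-to-right: (self >> other)(s) == other(self(s))
        return Morphism(lambda s: other.fn(self.fn(s)))

class Coalgebra(Generic[S, R]):
    """State -> possible next states: unfold a query state into results."""
    def step(self, state: S) -> Iterable[R]:
        raise NotImplementedError

# Morphisms compose into small pipelines:
lowercase = Morphism(lambda q: q.lower())
strip_ws = Morphism(lambda q: q.strip())
normalize = strip_ws >> lowercase
assert normalize.fn("  Quantum Search ") == "quantum search"
```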
There's something interesting that I learnt about category theory and its applicability to search in this project. In a nutshell, category theory did not make Vajra fast. The speed came from engineering choices - NumPy vectorization, sparse matrices, LRU caching, inverted index filtering, and partial sort for top-k. What category theory provided was clean code organization and a unified interface that works for both graph search and document retrieval. This is more valuable than it seems, because when you get into the numbers game, you tend to chase pure performance at all costs without due attention to the underlying architecture. And as I said earlier, architecture is one area I have paid a lot of attention to this year.
The latest benchmarks show Vajra achieving 180,000-800,000 queries per second on BEIR and Wikipedia datasets, outperforming BM25S by ~1,000x, Tantivy (Rust) by ~4,000x, and Pyserini (Lucene) by ~2,000x. Real production workloads have repeated queries and are typically built with caches. After the first execution, results are cached; subsequent calls return in microseconds. Cold queries take ~0.5ms; warm queries take ~0.001ms. This is a good example of how architecture and engineering choices can make a difference to performance.
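Two of the engineering choices mentioned above, partial sort for top-k and caching of repeated queries, are simple enough to sketch. This snippet is illustrative rather than Vajra's actual code, and score_query is a hypothetical stand-in for a BM25 scorer.

```python
# Illustrative sketch of two engineering choices named above (partial sort
# for top-k, and LRU caching of repeated queries); not Vajra's actual code.
from functools import lru_cache
import numpy as np

def score_query(query: str) -> np.ndarray:
    """Hypothetical stand-in for a BM25 scorer over a 10,000-document corpus."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    return rng.random(10_000)

def top_k(scores: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k highest scores without fully sorting all documents."""
    idx = np.argpartition(scores, -k)[-k:]       # O(n) partial selection
    return idx[np.argsort(scores[idx])[::-1]]    # sort only the k survivors

@lru_cache(maxsize=10_000)
def cached_search(query: str, k: int = 10) -> tuple:
    """Warm queries hit the cache; cold queries pay the scoring cost once."""
    return tuple(top_k(score_query(query), k))
```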
The project reached v0.3.0 on PyPI this year, with an interactive CLI (vajra-search) that lets you point at any JSONL corpus and have a search engine in your terminal. There's an extensive blog post documenting the benchmarks and architecture if you want the full technical deep-dive.
GitHub: github.com/aiexplorations/vajra_bm25 | PyPI: pypi.org/project/vajra-bm25
Tlön (Tlon) Mathematics
Inspired by Borges' "Tlön, Uqbar, Orbis Tertius" - a story about a world where language has no nouns, only verbs - I built a mathematical framework where processes are primitive and objects emerge as stable patterns. The name comes from Sanskrit "Tlön" (though Borges invented the word), and the project began as an experiment: could we formalize the idea that a rock isn't a thing but a pattern of atomic processes that happens to be stable?
The framework is built on 20 axioms across 6 groups, with the key definition being stability: $\text{Stable}(\pi) \Leftrightarrow \pi \circ \pi \approx \pi$. A stable process is one where doing it twice is equivalent to doing it once - these are the emergent "objects" of Tlön.
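As a toy illustration of the stability condition, here is a minimal sketch, assuming a "process" is just a function acting on a state vector. This is not the Tlön codebase; it only demonstrates the idempotence idea.

```python
# Minimal illustrative check of the stability (idempotence) condition above;
# a "process" here is just a function on a state vector, not Tlon's formalism.
import numpy as np

def is_stable(process, state, tol=1e-6):
    """Stable(pi) iff applying pi twice is approximately applying it once."""
    once = process(state)
    twice = process(once)
    return np.allclose(twice, once, atol=tol)

# A projection is stable: projecting twice equals projecting once.
project = lambda x: np.array([x[0], 0.0])
assert is_stable(project, np.array([3.0, 4.0]))
```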
One of the most striking results is the Doctrine of No Inverses: for any process with positive duration, there exists no inverse that annihilates it to nothing. Happenings cannot un-happen. The correct concept is reversal (completing a stable cycle), not inverse (annihilation).
The simulations bring these ideas to life:
- Double pendulum: Demonstrates transient processes - chaotic trajectories that never return to themselves
- N-body problem: Shows that stability is special, not generic - N=2 is stable, but adding just one body (N=3) destroys stability
- Lotka-Volterra predator-prey: Demonstrates resonance - two individually unstable processes (prey and predator) that together form a stable oscillating pattern (a minimal simulation sketch follows this list)
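For the predator-prey case, a few lines of standard ODE integration are enough to see the resonance. This is an illustrative sketch with assumed parameter values, not the simulation code from the Tlön repository.

```python
# Illustrative Lotka-Volterra sketch: two individually unstable processes
# (prey growth, predator decay) form a stable joint oscillation.
from scipy.integrate import solve_ivp

def lotka_volterra(t, z, a=1.0, b=0.4, c=0.4, d=0.1):
    prey, predator = z
    return [a * prey - b * prey * predator,          # prey grows, gets eaten
            d * prey * predator - c * predator]      # predator feeds, dies off

sol = solve_ivp(lotka_volterra, (0.0, 60.0), [10.0, 5.0], max_step=0.01)
# Neither population settles to a fixed value, but together they trace a
# closed cycle in (prey, predator) space: a stable pattern built from
# unstable parts.
```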
Additionally, Tlön mathematics has been used to frame machine learning algorithms. I've implemented several common models (traditional ML algorithms as well as deep learning models such as dense networks, ConvNets and Transformers) on top of Tlön abstractions. While they use the usual linear algebra and optimization machinery underneath, they frame model training in Tlön terms, so that stability, emergence and other vital concepts from Tlön mathematics can be discussed in the context of ML.
There's an extensive blog post documenting the axioms, theorems, code structure, and simulations if you want the full deep-dive. I'm seeking mathematician review to check the proofs and identify gaps.
GitHub: github.com/aiexplorations/tlon_math
The (Incomplete) Book
I began writing a book on ML/AI design patterns: 128 patterns across 14 chapters, roughly 450 pages. It covered foundation model integration, agentic AI, MLOps, safety and governance, with a delivery schedule running through February 2026. The discipline of writing for publication, of having to explain ideas clearly enough for strangers to understand, has been as valuable as the technical work itself.
Unfortunately, I wasn't able to do justice to the material, given the schedules committed, and the book is not yet complete. I plan to revisit it in the coming year or two. I wish to thank my wife Meera for her support and encouragement while I was in "writer" mode. Big thanks to CM for being so encouraging and helping me out. I also want to thank BPB Publications for giving me the opportunity - too bad it didn't work out, and perhaps we'll cross paths again on this or other book projects!
As they say, never waste a good crisis - and I say this in the context of the book, because the decision to terminate the project didn't come easy; I had written three chapters and over a hundred pages in all, with code, examples and diagrams. The crisis taught me a lot about the process of writing technical books, and there were many positives for me. This book project:
- Tested my knowledge and understanding - I found myself doing a lot of research, and reading lots of papers, and buying books I knew I wanted but hadn't invested in.
- Tested my ability to communicate and write lucidly - as someone with a wordy, explanatory style, I had to develop new writing styles and skills to meet the constraints and needs of book writing
- Put me in front of a publisher! This was an interesting experience for me, as a first time author.
- Helped me build a discipline for consuming and writing that I knew I had the potential for
- Helped me understand what a good workflow for writing a technical book is. This ranged from research workflow, making jots, code, experimentation, reading research papers, validation from experts, and putting together a manuscript. This is a vital skill!
I have other ideas for writing books, for sure. In some sense, I have been authoring booklets and technical reports (although these are not quite the same thing as a book) for the different projects I'm working on.
Deep Lyapunov - Deep Learning Model Training as a Dynamical Systems Process
Recently, I have begun digging a lot into mechanistic interpretability, the sub-field of AI research that seeks to explain how models work. I had done some adjacent work earlier this year, when I built a system to understand the chaotic dynamics of agent-to-agent conversations. While I have understood Lyapunov exponents in the context of dynamical systems and used these methods in dynamical systems analysis in the past, I had not applied them to conversations between humans and AI, or between one AI agent and another.
After many experiments, I found myself writing a pipeline to study how neural network weights evolve during training, using perturbation analysis and Lyapunov exponents to understand trajectory stability. The question driving this work: do small changes in initialization lead to similar or different final solutions? Understanding this helps with reproducibility, ensemble diversity, and architecture selection.
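The core measurement is simple to sketch: train two copies of a model from nearly identical initializations, record the flattened weights at each step, and look at how fast the trajectories separate. The snippet below is an illustrative sketch of that idea, not the deep-lyapunov library's API.

```python
# Illustrative sketch (not the deep-lyapunov API): estimate how fast two
# weight trajectories, started from nearly identical initializations, diverge.
# A positive slope of log-distance vs. step plays the role of a Lyapunov exponent.
import numpy as np

def divergence_exponent(trajectory_a, trajectory_b):
    """trajectory_*: arrays of shape (steps, n_params) of flattened weights."""
    dists = np.linalg.norm(trajectory_a - trajectory_b, axis=1)
    steps = np.arange(len(dists))
    # Slope of log(distance) over training steps ~ exponential divergence rate.
    slope, _ = np.polyfit(steps[1:], np.log(dists[1:] + 1e-12), 1)
    return slope

# Toy usage with synthetic trajectories that separate exponentially:
steps, n_params = 200, 1000
base = np.cumsum(np.random.randn(steps, n_params) * 0.01, axis=0)
perturbed = base + 1e-6 * np.exp(0.05 * np.arange(steps))[:, None]
print(divergence_exponent(base, perturbed))   # ~0.05 for this toy example
```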
Below, you'll find the experimental repo in which deep learning dynamics have been analyzed, as well as Deep Lyapunov, a library I built and released to PyPI yesterday.
GitHub: github.com/aiexplorations/deep_learning_dynamics, github.com/aiexplorations/deep-lyapunov | Blog Post: Deep Lyapunov - Deep Learning Dynamics
PyPI: pypi.org/project/deep-lyapunov
Related Project: github.com/aiexplorations/agentic_nld
ToDACoMM: Topological Signatures of Transformer Representations
If Deep Learning Dynamics asks "how do weights evolve during training?", ToDACoMM (Topological Data Analysis Comparison of Multiple Models) asks the complementary question: "what is the shape of the representation space that training carves out?"
The project uses persistent homology to characterize transformer activations. The pipeline extracts hidden states from each layer, projects to 50 principal components, and computes Vietoris-Rips persistent homology via Ripser. The key metrics are H0 (connected components, roughly "how spread out are the clusters?") and H1 (loops, roughly "are there circular patterns in the geometry?").
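The pipeline is straightforward to sketch with off-the-shelf tooling. The snippet below is illustrative rather than the ToDACoMM code itself; the model choice, sample counts and layer selection are placeholders.

```python
# Illustrative pipeline sketch, not the ToDACoMM codebase: hidden states
# -> PCA projection -> Vietoris-Rips persistent homology via Ripser.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.decomposition import PCA
from ripser import ripser

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token                      # GPT-2 has no pad token
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

texts = ["a sample sentence", "another sample sentence"]   # real runs use ~500 samples
batch = tok(texts, return_tensors="pt", padding=True)
with torch.no_grad():
    hidden = model(**batch).hidden_states          # one tensor per layer

acts = hidden[-1].reshape(-1, hidden[-1].shape[-1]).numpy()   # final-layer token states
n_components = min(50, acts.shape[0])              # the full pipeline uses 50 PCs
points = PCA(n_components=n_components).fit_transform(acts)
dgms = ripser(points, maxdim=1)["dgms"]            # dgms[0] = H0, dgms[1] = H1
```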
The central finding is categorical: encoder and decoder architectures occupy fundamentally different topological regimes. BERT (bidirectional attention) shows an expansion ratio of 2x from embedding to final layer. GPT-2 (causal attention) shows 95x. Other decoders range from 55x (DistilGPT-2) to 694x (SmolLM2-360M). This isn't gradual variation - it's a stark divide.
The explanation follows from how attention works. BERT's bidirectional attention gives each token access to full context from layer one - representations don't need to expand because all information is already accessible. GPT-2's causal attention means each layer must encode more context than the last as the model accumulates the prefix. The 2x vs 55-694x expansion is the topological signature of this architectural difference.
Every model showed non-trivial H1 at 500 samples - there are loops in representation geometry. Whether these reflect syntactic patterns, semantic cycles, or learned positional structure remains an open question. SmolLM2-360M showed H1 total persistence of 129.52, more than 3x higher than any other model - an anomaly worth investigating.
These two projects - Deep Learning Dynamics and ToDACoMM - are complementary lenses on the same phenomenon. One measures how the optimization carves the weight space; the other measures what gets carved in activation space. Together they suggest a research direction: do divergent weight trajectories (high Lyapunov exponents) correlate with distinct topological signatures in activations?
While there is no PyPI package yet for ToDACoMM, I intend to put one together in the coming year.
GitHub: github.com/aiexplorations/todacomm | Blog Post: The Shape of Learning
What Didn't Work
The projects that failed taught me more than the ones that succeeded. They also provided a lot of scaffolding on which to build further ideas with more confidence.
The TDA-DNN project, where I explored H0 and H1 homologies in the weights of deep neural networks, was a really interesting and engrossing project. It allowed me to think about how the weights of a network are likely to be distributed, and whether the presence of certain topological primitives in them would help build a case for mechanistic interpretability. The persistence of those homologies led to a hypothesis that I explored unfruitfully. The project did provide interesting opportunities for improvement, and scaffolding on which to build ToDACoMM, so it was ultimately helpful and instructive.
Building with Praval was another exciting learning exercise. Elsewhere on this blog I have described some of the issues I encountered, but in a nutshell, building your own framework and an application around it teaches you a lot about how you set up the foundations and how to ensure you don't mess those up. Adding over a thousand tests, fixing async race conditions and many framework-level bugs were important milestones in my own technical growth, and great instructors in how to effectively manage technical debt. Praval Code, my failed experiment in using the Praval multi-agent framework for code generation, may be resurrected in future with more confidence, thanks to the lessons learned this year.
What I Learned
On Research Direction
Rich Sutton's Bitter Lesson has been on my mind all year: general methods that leverage computation tend to beat clever, hand-designed approaches. This doesn't mean cleverness is worthless, but it suggests where to place bets. I've been trying to internalize this, to resist the temptation of elegant solutions that don't scale. Especially with ToDACoMM and Tlon mathematics, the bitter lesson was an evident and direct dictum worth remembering.
I've also learned that dynamical systems knowledge remains underutilized in practical AI. Koopman operators, Lyapunov analysis, phase space methods; these tools have been developed over decades in physics and applied mathematics, but their application to neural networks and AI systems is still sparse. There may be opportunities there.
I think there is a case to be made for a mathematics-first approach to building deep learning models, in which text-based next-token prediction works alongside numerical quantity prediction. As Petar Velichkovich says in a recent interview, we literally have models that perform hundreds or thousands of multiplications internally but cannot multiply the numbers put into the context window as input. This paradox has to be solved if deep learning in its current form is to evolve and become more useful. We seem to have cracked at least some of the code of how code generation agents can be built, but we are yet to crack how mathematics can be done natively with deep learning models.
On Containers
The environment you work in matters as much as the work itself. I've started calling this "choosing the right container." A misaligned container (wrong role, wrong company, wrong domain) dampens even the best ideas. Looking back at my career, the periods of highest productivity coincided with containers that gave me latitude to explore. And the prestige of the position came automatically - in an interesting way, this mirrors a belief in Hindu spirituality, where Goddess Lakshmi, the Goddess of Wealth, arrives once Goddess Saraswati, the Goddess of Learning, has already planted herself. I think that working with curiosity is a superpower, and working diligently, with attention to detail and without expectation of a reward, is its own reward.
On Curiosity and Accomplishment
This year forced me to confront a pattern in my projects: I start with genuine intellectual curiosity, go deep, then lose interest when I compare my work to genuinely ground-breaking research that has perhaps taken entire teams years of expertise to unlock. Such a comparison is itself flawed, for clear and evident reasons. When it comes to these personal projects, I'm just a guy with a powerful AI coding assistant and some spare time in which to explore big ideas - which makes this all very worth doing, when you frame it that way!
Approaching ideas with a sense of genuine curiosity, and taking the time to read up, structure my thoughts, and be factual and consistent before I build (even if we have tools for getting past some of these steps), will come back to be important once again. Outsourcing one's thinking, as many techies have become comfortable doing in 2025, is something I have begun to grow out of for the important projects. Sometimes the learning and the exploration are indistinguishable - with the proofs of Tlön primitives I attempted a month ago, for example, the learning is in the doing.
One more thing about curiosity: there is now a responsibility, on the part of those of us using powerful AI tools, to build impactful, new, differentiated things. It is a responsibility because if we use tools that are powerful in specific ways to do basic things, or things the tool is not meant for, we are likely to end up disappointed with the results.
Tiny Projects are Powerful
Claude Code has enabled rapid prototyping of ideas for millions of developers worldwide, and I'm just one more individual that's benefited from this glut of reasoning and agentic coding power.
During the last year, I experimented with many small projects that don't get a mention here. The pattern I saw here with such rapid prototyping is roughly as follows:
- Inspiration: I get inspired by a conversation, a podcast, book or another project
- Exploration: Use Claude to quickly spin up a working project, having explored some fundamental ideas related to the subject in question.
- Flare and Focus: Discard the ones that don't show promise, and expand on those which do.
- Publish and post: Generally, the flaring portion of promising projects leads to some kind of published work. It could be a library or a package, or a post or a paper.
Overall, tiny projects centred only on inspiration and exploration are supremely powerful. More than anything, they help you understand the deeper motivations you have in a project.
Looking Forward
The technical threads running through 20+ years of work remain the same sometimes-intersecting things: engineering, mathematical structure, statistical and computational problem solving, dynamics, optimization, complex systems, and perhaps more. What changes is the frontier where that intersection is explored.
For 2026, my priorities on the personal projects front include continuing development on ToDACoMM (adding training dynamics tracking and larger model analysis), preparing Tlön mathematics for arXiv, and deploying Praval Deep Research as my default research platform. Strategically, I'm exploring what it would mean to set up an independent research structure, finding collaborators who share the interest in mathematical foundations.
At work, there are numerous priorities spanning technical, organizational and strategic elements that I look forward to - the team and the next year of innovation there beckon.
And another piece of important work is perhaps internal. Can I maintain the posture of genuine curiosity, of learning for its own sake, while still exploring ideas and producing projects that matter? Can I let some explorations lead nowhere without judging them failures?
The projects that worked best this year weren't the ones chasing impact. They were the ones that let curiosity lead, with honesty about what the evidence showed, even when it showed that I was wrong. This applied especially to Praval, Vajra Search, ToDACoMM and Deep Lyapunov.
Sometimes the learning is the point, and the results will follow.
If you want to explore any of these projects, find them at github.com/aiexplorations. I'd welcome your thoughts; reach out via the contact page or find me on LinkedIn.