Obsidian Launches Defuddle, Taking Obsidian Web Clipper to New Heights
I have always loved Obsidian's core philosophy: local first, everything is a file, and every note is simply a plain-text Markdown file. In this model, notes belong entirely to us, and we can freely combine components and plugins to tailor operations and workflows to our own habits. The preservation, backup, and synchronization of information also stay under our control.
I previously introduced Web Clipper, Obsidian's browser extension for web clipping. It follows the same "file-centric" philosophy: it turns the web page we are browsing into a Markdown note, complete with metadata, and stores it in the Obsidian vault.
Recently, Obsidian launched a new website, Defuddle.md. Defuddle is a very powerful tool in Obsidian's local file ecosystem. Simply put, it is the web version of Obsidian Web Clipper.
If we think of Obsidian as an OS for local notes, then the CLI (command line interface) I introduced earlier is one way in, and Defuddle is more like a URL interface for Obsidian Web Clipper.
So, before diving into Defuddle, let's quickly review Web Clipper.
Obsidian Web Clipper: Web Pages to Markdown
When it comes to Obsidian Web Clipper, I personally think its most attractive feature is that it goes well beyond traditional web saving and capturing.
It not only captures web pages but also supports extremely flexible Obsidian templates. The Web Clipper can extract various metadata from web pages (such as author, publication time, and even specific page elements). What surprised me even more is that it now supports conditional logic and loops. This means that during the web collection phase, we can organize the content according to our own rules and directly turn it into clean, structured local Markdown files.
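To make the idea of conditions and loops at clipping time concrete, here is a minimal Python sketch. It is not Web Clipper's template syntax; the function `render_note` and the metadata keys (`title`, `author`, `published`, `tags`) are hypothetical names chosen for illustration. It only shows the kind of logic a template can express: emit a field only if the page provided it, and repeat a line for each item in a list.

```python
from datetime import date


def render_note(meta: dict, body: str) -> str:
    """Render a Markdown note with front matter from clipped metadata.

    Hypothetical helper for illustration; the metadata keys are assumptions.
    """
    lines = ["---", f'title: "{meta["title"]}"']

    # Conditional logic: only emit fields the page actually provided.
    if meta.get("author"):
        lines.append(f'author: "{meta["author"]}"')
    if meta.get("published"):
        lines.append(f'published: {meta["published"].isoformat()}')

    # Loop: one list entry per tag.
    if meta.get("tags"):
        lines.append("tags:")
        for tag in meta["tags"]:
            lines.append(f"  - {tag}")

    lines.append("---")
    lines.append("")
    lines.append(body.strip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    meta = {
        "title": "An example article",
        "author": "",  # missing on the page, so it is skipped
        "published": date(2024, 1, 2),
        "tags": ["clippings", "pkm"],
    }
    print(render_note(meta, "Main text of the article."))
```

A real template does the same thing declaratively, so the rules live in the clipper's settings rather than in a script.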
Of course, the downside is that if users do not like the default organization template, they need to configure it themselves, which may have some barriers (actually, AI can help with this).
But overall, Obsidian Web Clipper fits perfectly into the Obsidian ecosystem, allowing web content to flow smoothly into our personal knowledge base. For power users, it also leaves room for tinkering. Do not underestimate these power users; many Obsidian plugins are created by users themselves.
Defuddle.md: A Focused Extraction Layer
The arrival of Defuddle makes me feel that Obsidian has split out the core web extraction capability behind Web Clipper and opened it up as a standalone tool.
If you collect information in a particular field, or you are a researcher or a data analyst, and you can tinker a bit yourself or get AI (including the recently popular "small lobster") to tinker for you, then Defuddle.md is definitely a pleasant surprise!
Give Defuddle a link, and it can help you clean up ads, recommendation areas, and other cluttered elements on the web page, trying to extract clean main text and structured metadata. It acts like a purifier specifically responsible for converting complex web pages into standard Markdown text. With this, anyone can create their own Web Clipper without being tied to Obsidian.
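The "purifier" idea can be illustrated with a toy sketch. This is not Defuddle's algorithm, which is far more sophisticated. It assumes that clutter lives in a fixed set of tags (`CLUTTER_TAGS`, a made-up list) and that only headings and paragraphs matter. The class `MainTextExtractor` and the function `to_markdown` are hypothetical names; the code uses only Python's standard `html.parser`.

```python
from html.parser import HTMLParser

# Assumption for this toy example: anything inside these tags is clutter.
CLUTTER_TAGS = {"nav", "aside", "footer", "header", "script", "style", "form"}
# Block tags we keep, mapped to the Markdown prefix they get.
BLOCK_PREFIX = {"h1": "# ", "h2": "## ", "p": ""}


class MainTextExtractor(HTMLParser):
    """Collect headings and paragraphs that are not inside clutter tags."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while we are inside a clutter element
        self.blocks = []     # finished Markdown blocks
        self.current = []    # text pieces of the block being read
        self.prefix = ""

    def handle_starttag(self, tag, attrs):
        if tag in CLUTTER_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX:
            self.current = []
            self.prefix = BLOCK_PREFIX[tag]

    def handle_endtag(self, tag):
        if tag in CLUTTER_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in BLOCK_PREFIX:
            # Collapse runs of whitespace inside the block.
            text = " ".join("".join(self.current).split())
            if text:
                self.blocks.append(self.prefix + text)
            self.current = []

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)


def to_markdown(html: str) -> str:
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.blocks)


if __name__ == "__main__":
    page = (
        "<html><body><nav><p>Home</p></nav>"
        "<article><h1>Title</h1><p>Hello <b>world</b>.</p></article>"
        "<aside><p>Recommended for you</p></aside></body></html>"
    )
    print(to_markdown(page))
```

Real pages rarely mark their clutter this neatly, which is exactly why a dedicated extraction layer is valuable.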
Defuddle offers several access methods.
- Regular users can open it in a browser and enter a web page URL to see the cleaned HTML or Markdown.
- AI agents or developers can use the URL interface to obtain cleaned conversion results.
- Obsidian users can directly use the Web Clipper plugin (which is powered by Defuddle).
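For the second access method, a script or agent might call the URL interface roughly as sketched below. The URL pattern is an assumption, not something stated in this article: the sketch assumes the target page (without its scheme) is appended to the site address, and that the response body is the converted Markdown. Check the site's own documentation before relying on it. The names `DEFUDDLE_BASE`, `build_defuddle_url`, and `fetch_markdown` are hypothetical.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

# Site named in the article; the path layout below is an ASSUMPTION.
DEFUDDLE_BASE = "https://defuddle.md"


def build_defuddle_url(page_url: str) -> str:
    """Build a request URL, assuming the pattern <base>/<host><path>?<query>."""
    if "://" not in page_url:
        page_url = "https://" + page_url
    parts = urlsplit(page_url)
    if not parts.netloc:
        raise ValueError(f"not a usable page URL: {page_url!r}")
    target = parts.netloc + parts.path
    if parts.query:
        target += "?" + parts.query
    return f"{DEFUDDLE_BASE}/{target}"


def fetch_markdown(page_url: str, timeout: float = 10.0) -> str:
    """Fetch the cleaned result as text (assumes a UTF-8 text response)."""
    with urlopen(build_defuddle_url(page_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds the URL; call fetch_markdown() to actually hit the network.
    print(build_defuddle_url("https://example.com/some/article"))
```

Keeping URL construction separate from the network call makes the assumed pattern easy to correct in one place.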
More importantly, Defuddle is open-source, and we can even deploy it locally. This is very Obsidian-like: it lets the app become a system that users control, rather than turning users into appendages of the app.
Conclusion
There are many clipping plugins on the market, and a lot of them aim to lock users into a specific app. Obsidian Web Clipper is different because Obsidian's philosophy is File Over App. Now it goes a step further: Defuddle has been opened up, anyone can use it directly, and it is open-source as well. It feels almost too generous, perhaps more than is wise.
If you are interested in this type of web scraping (an important part of PKM), Jina.ai has also offered a paid Reader API for some time (see image below). Jina is the first choice for many AI practitioners, and it now appears to have a free, open-source competitor. Of course, Jina describes its offering as an AI model for cleaning and scraping, which is slightly different.
To be honest, I am now starting to worry that Defuddle might be abused (or blocked).

