Matching Obsidian links with Regex
Links to match
To match the basic types of links in Obsidian [1], we need a regex that captures the following:
- Simple Markdown links ([[Obsidian]])
- Markdown links with alias ([[PKM app|Obsidian]])
- File embeds (![[obsidian-banner.png]])
- File embeds with aliases (![[Obsidian Banner|obsidian-banner.png]])
Regex that matches links
This regular expression captures all link types mentioned before:
const mdReg = /(!)?\[\[(?:(.+?)\|)?(.+?)\]\]/g;
Let’s break that down:
- (!)? (optional capture group): Captures the ! character before a link. If there is none, the group is undefined.
- \[\[: Two opening square brackets. The [ character needs to be escaped with a backslash in regular expressions.
- (?:(.+?)\|)? (optional non-capturing group that wraps a capture group): Captures the characters between the opening brackets and a pipe (|). This is the alias of a link or embed. If there is no pipe, the group is undefined.
- (.+?) (capture group): Captures the remaining characters up to the closing brackets. Without a pipe, this is the whole text within [[ ]]; with a pipe, it is the text after the pipe.
- \]\]: Two closing square brackets (escaped with \ as well).
You can test this RegEx and see a visual explanation on Regexr.
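To make the breakdown above easier to follow, here is a minimal sketch of the same pattern written with named capture groups. The group names (embed, beforePipe, afterPipe) and the describeLink helper are illustrative assumptions and are not used anywhere else in this post; the pattern itself is unchanged.

```javascript
// Same pattern as above, without the g flag, and with named groups.
// The group names are hypothetical labels chosen for this sketch.
const namedLinkPattern =
  /(?<embed>!)?\[\[(?:(?<beforePipe>.+?)\|)?(?<afterPipe>.+?)\]\]/;

// Returns the three groups of the first link in `text`, or null if there is none.
const describeLink = (text) => {
  const match = text.match(namedLinkPattern);
  if (!match) return null;
  const { embed, beforePipe, afterPipe } = match.groups;
  return { embed, beforePipe, afterPipe };
};

console.log(describeLink("![[Obsidian Banner|obsidian-banner.png]]"));
// -> { embed: "!", beforePipe: "Obsidian Banner", afterPipe: "obsidian-banner.png" }

console.log(describeLink("[[Obsidian]]"));
// -> { embed: undefined, beforePipe: undefined, afterPipe: "Obsidian" }
```

Groups that do not take part in a match come back as undefined, which is why the optional parts are easy to test for.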
Using the Regex in JavaScript
Using JavaScript, we can then iterate over those matches:
const content = "…"; // A string with some links in it
const mdRegex = /(!)?\[\[(?:(.+?)\|)?(.+?)\]\]/g;
const matchesMd = content.matchAll(mdRegex);
for (const match of matchesMd) {
  const embedPrefix = match[1]; // -> "!" or `undefined`
  const alt = match[2]; // -> alt text or `undefined`
  const link = match[3]; // -> link text
  console.log([embedPrefix, alt, link]);
}
This is the output we get when replacing the content variable with different types of Markdown:
Matching a simple link
const content = "[[Get started with Obsidian]]";
// … running the match of matchesMd loop
// -> [undefined, undefined, "Get started with Obsidian"]
Matching a link with alias
const content =
"[[From plain-text note-taking|I have used plain-text based apps]]";
// … running the match of matchesMd loop
// -> [undefined, "From plain-text note-taking", "I have used plain-text based apps"]
Matching an Embed
const content = "![[obsidian-banner.png]]";
// … running the match of matchesMd loop
// -> ["!", undefined, "obsidian-banner.png"]
Matching an Embed with alias
const content = "![[screenshot of an obsidian graph|obsidian-graph.png]]";
// -> ["!","screenshot of an obsidian graph","obsidian-graph.png"]
Function to return Obsidian links from text
That’s pretty good already!
But we can make this even more convenient and powerful by moving the logic into a reusable function. When we call getMdMatches() and pass it a string, the function returns an array with one object per match.
const getMdMatches = (content) => {
  const mdReg = /(!)?\[\[(?:(.+?)\|)?(.+?)\]\]/g;
  const matchesMd = content.matchAll(mdReg);
  const matchesArr = [...matchesMd];
  const outputArray = matchesArr.map((match) => ({
    link: match[3],
    alt: match[2] ?? "",
    isEmbed: Boolean(match[1]),
    original: match[0],
  }));
  return outputArray;
};
Example output of the function
const content = `[[Get started with Obsidian]]
[[From plain-text note-taking|I have used plain-text based apps]]
![[obsidian-banner.png]]
![[screenshot of an obsidian graph|obsidian-graph.png]]`;
const mdMatches = getMdMatches(content);
console.log(mdMatches);
// [
// {
// "link": "Get started with Obsidian",
// "alt": "",
// "isEmbed": false,
// "original": "[[Get started with Obsidian]]"
// },
// {
// "link": "I have used plain-text based apps",
// "alt": "From plain-text note-taking",
// "isEmbed": false,
// "original": "[[From plain-text note-taking|I have used plain-text based apps]]"
// },
// {
// "link": "obsidian-banner.png",
// "alt": "",
// "isEmbed": true,
// "original": "![[obsidian-banner.png]]"
// },
// {
// "link": "obsidian-graph.png",
// "alt": "screenshot of an obsidian graph",
// "isEmbed": true,
// "original": "![[screenshot of an obsidian graph|obsidian-graph.png]]"
// }
// ]
The function also works when other text surrounds the links. In this example, each link sits on its own line:
const content = `First of all, tell me a little bit about what's your experience with note-taking apps like?
-> [[No prior experience|I have no prior experience]]
-> [[From standard note-taking|I’ve used note-taking apps like Evernote and OneNote]]
-> [[From plain-text note-taking|I have used plain-text based apps]]
![[obsidian-banner.png]]
→ [[Get started with Obsidian]]`;
const mdMatches = getMdMatches(content);
console.log(mdMatches);
// [
// {
// "link": "I have no prior experience",
// "alt": "No prior experience",
// "isEmbed": false,
// "original": "[[No prior experience|I have no prior experience]]"
// },
// {
// "link": "I’ve used note-taking apps like Evernote and OneNote",
// "alt": "From standard note-taking",
// "isEmbed": false,
// "original": "[[From standard note-taking|I’ve used note-taking apps like Evernote and OneNote]]"
// },
// {
// "link": "I have used plain-text based apps",
// "alt": "From plain-text note-taking",
// "isEmbed": false,
// "original": "[[From plain-text note-taking|I have used plain-text based apps]]"
// },
// {
// "link": "obsidian-banner.png",
// "alt": "",
// "isEmbed": true,
// "original": "![[obsidian-banner.png]]"
// },
// {
// "link": "Get started with Obsidian",
// "alt": "",
// "isEmbed": false,
// "original": "[[Get started with Obsidian]]"
// }
// ]
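One caveat worth checking before relying on this for arbitrary notes: because .+? can run across ]] while it looks for a pipe, a plain link followed by an aliased link on the same line is captured as a single match. The sketch below compares the pattern from this post with a stricter variant that uses negated character classes. The stricter pattern and the originals helper are assumptions for illustration, not part of the original function, and the stricter pattern no longer matches link text that itself contains square brackets.

```javascript
// The pattern used throughout this post.
const loosePattern = /(!)?\[\[(?:(.+?)\|)?(.+?)\]\]/g;

// Hypothetical stricter variant: the text before the pipe may not contain
// brackets, pipes or newlines; the text after it may not contain brackets
// or newlines. That keeps a match from spilling over into the next link.
const strictPattern = /(!)?\[\[(?:([^\[\]|\n]+)\|)?([^\[\]\n]+?)\]\]/g;

// Helper for this sketch: returns the full text of every match.
const originals = (text, pattern) =>
  [...text.matchAll(pattern)].map((match) => match[0]);

const sameLine = "See [[A]] and [[B|C]]";

console.log(originals(sameLine, loosePattern));
// -> ["[[A]] and [[B|C]]"]

console.log(originals(sameLine, strictPattern));
// -> ["[[A]]", "[[B|C]]"]
```

If your notes keep every link on its own line, as in the examples above, both patterns return the same matches.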
Use case: Transforming text for blog posts
I write my blog posts in Obsidian and then move them into a Nuxt site. There are a few transformations that I need to apply to make Obsidian’s markdown work with Nuxt.
- Internal links ([[…]]) are replaced with a link to the equivalent public note.
- Embeds (![[…]]) are replaced with the Markdown image syntax (![alt](link)), and the files are moved to the blog folder.
Using getMdMatches() makes it easy to match all links, format them the way I need, and replace the original links.
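As a rough sketch of what such a transformation could look like: the helper below rewrites embeds to Markdown image syntax and internal links to regular Markdown links. The names transformForBlog and toPublicUrl, the /notes/ and /images/ paths, and the slug rule are all assumptions made up for this example; the actual Nuxt setup is not shown in this post, and moving the files is left out.

```javascript
// Copy of getMdMatches() from above, so that this sketch runs on its own.
const getMdMatches = (content) => {
  const mdReg = /(!)?\[\[(?:(.+?)\|)?(.+?)\]\]/g;
  return [...content.matchAll(mdReg)].map((match) => ({
    link: match[3],
    alt: match[2] ?? "",
    isEmbed: Boolean(match[1]),
    original: match[0],
  }));
};

// Hypothetical URL scheme: lower-case the link text and join words with hyphens.
const toPublicUrl = (link) =>
  `/notes/${link.toLowerCase().replace(/\s+/g, "-")}`;

// Replaces every Obsidian link or embed with its (assumed) blog equivalent.
const transformForBlog = (content) =>
  getMdMatches(content).reduce((text, { link, alt, isEmbed, original }) => {
    const replacement = isEmbed
      ? `![${alt}](/images/${link})`
      : `[${alt || link}](${toPublicUrl(link)})`;
    // A function replacer inserts the text literally, even if it contains "$".
    return text.replace(original, () => replacement);
  }, content);

const note =
  "Read [[Get started with Obsidian]]\n![[obsidian graph|obsidian-graph.png]]";

console.log(transformForBlog(note));
// -> "Read [Get started with Obsidian](/notes/get-started-with-obsidian)\n![obsidian graph](/images/obsidian-graph.png)"
```

Replacing by the original string works here because each match is replaced in document order, so repeated links are handled one occurrence at a time.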
Beyond that, there may be many use cases where it is helpful to extract Obsidian links and embeds from a body of text. The regex and functions mentioned in this post provide a reliable and convenient way to transform your notes to your heart’s desire!
Further reading
- Spread syntax (…) - JavaScript | MDN
- Regular expressions - JavaScript | MDN
- String.prototype.matchAll() - JavaScript | MDN
Footnotes
[1] This list doesn’t include heading links (#) or block references (^). I don’t use them often enough. 🤷♂️