Anchoring To Text Fragment — Namchee

Anchoring To Text Fragment

9 minutes read

I’m pretty sure most of you are already familiar with the concept of in-page links: URLs that contain a hash (#) fragment and make the page jump to a specific marked element. Still not familiar? Well, you can try it out by clicking the links inside the table of contents on the left side of this page.

What clicking links in the Table of Contents would do

The main use case of in-page links is to let users quickly navigate a lengthy page to a section of interest. This improves UX and, SEO-wise, increases discoverability and engagement, which might in turn increase user conversion rates.

Implementing in-page links is pretty simple: all you need to do is mark the section of interest with an id and append #<id> to your links.

URL Fragment 101

<!-- say, you want to jump directly to this section on navigate -->
<div id="section-of-interest">
	You just got coconut mall'ed
</div>

<!-- on the same page, you can jump to this section just by hash + id -->
<a href="#section-of-interest">Wanna have some fun?</a>

<!-- on different page, append the path with hash + id -->
<a href="/cool-page#section-of-interest">Wanna have some fun?</a>

<!-- when sharing to external source, again, just append hash + id -->
https://www.your-site.com/cool-page#section-of-interest

Jumping To Text?

But let’s say one day you want to create a link that allows users to jump to a specific piece of text instead of a marked section. Well, easy: just wrap the text in a <span> and give it a unique id! Unfortunately, this doesn’t scale for the following reasons:

  • The solution only works if you have full control over the content. It doesn’t work for third-party generated content, whether from users or, these days, LLMs.
  • More elements mean more unnecessary complexity. Need to add words to the marked text? What if the id is invalid? Well, you have to modify the code again!

Since the standard element fragment clearly doesn’t scale well, we can try solving the problem with JavaScript, which should look similar to the following:

scroll-to-text.js

js

const targetText = `coconut mall'ed`;

// let's say the text is always a descendant of this element
const container = document.querySelector('.content');

// Create a TreeWalker, which is more efficient than iterating nodes recursively
const walker = document.createTreeWalker(container, NodeFilter.SHOW_TEXT);

let node = null;

while (walker.nextNode()) {
	if (walker.currentNode.nodeValue.includes(targetText)) {
		node = walker.currentNode;
		break;
	}
}

if (node && node.parentElement) {
	// can't scroll directly to text, so we scroll to the nearest ancestor
	// which is the parent of course
	node.parentElement.scrollIntoView({ behavior: 'smooth', block: 'center' });
}

Unfortunately, this means yet another reliance on JavaScript. What if your user disables JavaScript? What if you want to reduce the JavaScript footprint of your site? Or maybe you’ve had enough of JavaScript?

Turns out, the WICG has addressed this by introducing URL Fragment Text Directives, first introduced to the public in 2020 and finally hitting stable status in 2023.

Dissecting the API

Similar to element fragments, you can insert text fragments into a URL after the hash (#) with the following syntax, where the parts in square brackets are optional:

#:~:text=[prefix-,]textStart[,textEnd][,-suffix]

While it does look similar to your standard element fragment, there are a few key differences besides the stricter format:

  • All inputs within this directive must be percent-encoded, with some additional cases explained in the section below.
  • Each part of the syntax must reside inside one block-level element; however, the whole match may span multiple blocks.
  • It’s possible to define multiple text fragments by separating them with &, just like you usually do with query strings (for example, #:~:text=first&text=second).
  • Instead of styling the fragment with the :target pseudo-class, text fragments are styled with the ::target-text pseudo-element. By default, the browser styles text fragments the same way as <mark>.

When a browser receives a navigation to a URL that contains text fragment(s), it will try to find text that matches the directive, even if it’s visually hidden.

If a text matching the fragment is found, the browser will scroll to and highlight the first matching fragment. If no match is found, the browser will do nothing.
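
To make the syntax above more concrete, here is a minimal sketch that splits a single text= value back into its parts. The parseTextDirective helper is a hypothetical name of my own, not part of any browser API, and it assumes the value is well-formed and already percent-encoded as described above.

```javascript
// Hypothetical helper (not a browser API): split one `text=` value into parts.
// Assumes literal "-" and "," inside the text were percent-encoded beforehand,
// so any raw "-" or "," left in the value belongs to the directive syntax.
function parseTextDirective(value) {
	const parts = value.split(',');
	const result = { prefix: null, start: null, end: null, suffix: null };

	// A leading part that ends with "-" is the prefix.
	if (parts.length > 1 && parts[0].endsWith('-')) {
		result.prefix = decodeURIComponent(parts.shift().slice(0, -1));
	}

	// A trailing part that starts with "-" is the suffix.
	if (parts.length > 1 && parts[parts.length - 1].startsWith('-')) {
		result.suffix = decodeURIComponent(parts.pop().slice(1));
	}

	// What remains must be textStart, optionally followed by textEnd.
	if (parts.length < 1 || parts.length > 2) {
		throw new Error(`Invalid text directive: ${value}`);
	}

	result.start = decodeURIComponent(parts[0]);
	result.end = parts.length === 2 ? decodeURIComponent(parts[1]) : null;

	return result;
}
```

Because literal dashes and commas must be percent-encoded, a raw - next to a , can only mean "this is a prefix or suffix", which is what keeps the parsing unambiguous.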

Building Text Fragment

After a short introduction to the API and what it’s trying to solve, you’re now ready to build your first text fragment.

Things are pretty simple if you want to build the fragment programmatically and have full control of the content. All you have to do is encode the fragment with the encodeURIComponent function, which is available as a global JavaScript API.

However, as mentioned above, you also need to handle the special cases of the -, &, and , characters, which must also be percent-encoded to avoid confusion with the text directive syntax per the spec. encodeURIComponent already escapes & and , but leaves - untouched, so the dash needs explicit handling.

Building Text Fragment Programmatically

js

function encodeTextFragment(fragment) {
	return encodeURIComponent(fragment)
		.replace(
			/[-&,]/g,
			(c) => '%' + c.charCodeAt(0).toString(16).toUpperCase()
		);
}

// Sample usage
const textPrefix = encodeTextFragment('you just got');
const textStart = encodeTextFragment(`coconut`);
const textEnd = encodeTextFragment(`mall-ed`);
const textSuffix = encodeTextFragment('share this to your friends & family!');

const baseUrl = '/cool-page';
const url = `${baseUrl}#:~:text=${textPrefix}-,${textStart},${textEnd},-${textSuffix}`;
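
The sample above always uses all four parts. As a rough sketch, the same idea can be wrapped in helpers that skip the optional parts and join multiple directives with &. The names encodePart, buildTextDirective, and buildTextFragmentUrl are my own and purely illustrative.

```javascript
// Same encoding rule as before: percent-encode, then also escape "-",
// which encodeURIComponent leaves untouched.
function encodePart(text) {
	return encodeURIComponent(text).replace(/-/g, '%2D');
}

// Hypothetical helper: build one `text=` directive, skipping absent parts.
function buildTextDirective({ prefix, start, end, suffix }) {
	if (!start) {
		throw new Error('`start` is required');
	}

	const parts = [];
	if (prefix) parts.push(`${encodePart(prefix)}-`);
	parts.push(encodePart(start));
	if (end) parts.push(encodePart(end));
	if (suffix) parts.push(`-${encodePart(suffix)}`);

	return `text=${parts.join(',')}`;
}

// Hypothetical helper: put one or more directives behind a single ":~:".
function buildTextFragmentUrl(baseUrl, directives) {
	const joined = directives.map((d) => buildTextDirective(d)).join('&');
	return `${baseUrl}#:~:${joined}`;
}

// Sample usage
const multiUrl = buildTextFragmentUrl('/cool-page', [
	{ start: 'coconut', end: 'mall-ed' },
	{ start: 'friends & family' },
]);
// multiUrl is '/cool-page#:~:text=coconut,mall%2Ded&text=friends%20%26%20family'
```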

If you don’t have control over the content or simply want an easy way to generate a text fragment without touching the markup, you can use web extensions for Chromium-based browsers, Edge, Firefox, and Safari to generate one for you.

Interaction With Element Fragment

Since the text fragment occupies the same hash spot as the element fragment, you might be wondering if it’s possible to combine them. Actually, you can! You only need to append the text fragment directive after the element fragment. For example:

#jumping-to-text:~:text=since,scale%20well

In this case, the element fragment acts as a fallback when the browser cannot find text matching the text fragment directive, as demonstrated by this link, where I searched for the text wumbologic while falling back to this section.

However, if the browser is able to find text that satisfies the text fragment directive, it will ignore the element fragment, as demonstrated by this link, where I searched for the text text(s) while falling back to the previous section.
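
If you handle such combined URLs as plain strings (say, while generating or validating links), a small sketch like the one below can separate the two halves. splitFragment is a hypothetical helper of mine, and it is meant for URL strings you already hold, not for reading the live page URL.

```javascript
// Hypothetical helper: split a fragment string into the element id and the
// fragment directive. Works on URL strings you hold yourself.
function splitFragment(hash) {
	const raw = hash.startsWith('#') ? hash.slice(1) : hash;
	const delimiter = ':~:';
	const index = raw.indexOf(delimiter);

	// No directive delimiter: the whole thing is a plain element fragment.
	if (index === -1) {
		return { elementId: raw, directive: null };
	}

	return {
		elementId: raw.slice(0, index),
		directive: raw.slice(index + delimiter.length),
	};
}

// Sample usage
const combined = splitFragment('#jumping-to-text:~:text=since,scale%20well');
// combined.elementId is 'jumping-to-text'
// combined.directive is 'text=since,scale%20well'
```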

Is It A ‘Navigation’?

After reading the features and API specification above, you might think this API solves a use case you need for your app. You have played around with it in an isolated environment and found that it works as you expected.

You then port the functionality to your app, which uses client-side navigation with the History API and looks similar to the following:

Text fragment directive with History API. Surely nothing could go wrong!

js

const targetText = `coconut mall'ed`;

if (window.history) {
	const fragment = encodeURIComponent(targetText);
	const url = `/cool-page#:~:text=${fragment}`;

	window.history.pushState(null, '', url);
}

Unfortunately, you’ll find that the text fragment doesn’t work and the page just navigates normally. This is expected behavior, as the text fragment directive only triggers on user-initiated navigations.

Simply put, only navigations triggered via the browser’s back and forward buttons or through HTML elements that support navigation, such as <a> and <form>, will trigger the browser’s text fragment directive behavior.

For those of you who navigate programmatically using JavaScript, your only option is to replicate the text fragment directive behavior in JavaScript.
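
As a rough sketch of that fallback, you can push the URL and then scroll manually, reusing the TreeWalker approach from earlier. findTextNode and pushStateAndScroll are hypothetical names of mine, and the sketch assumes the target content is already rendered when it runs and that the text sits inside a single text node.

```javascript
// Hypothetical helper: find the first text node under `root` containing `text`.
function findTextNode(root, text) {
	const walker = document.createTreeWalker(root, NodeFilter.SHOW_TEXT);

	while (walker.nextNode()) {
		if (walker.currentNode.nodeValue.includes(text)) {
			return walker.currentNode;
		}
	}

	return null;
}

// Hypothetical helper: push a text-fragment URL, then scroll by hand, since
// pushState alone does not trigger the browser's own scroll and highlight.
// Returns true when a match was found and scrolled to.
function pushStateAndScroll(path, text, root = document.body) {
	const fragment = encodeURIComponent(text).replace(/-/g, '%2D');
	window.history.pushState(null, '', `${path}#:~:text=${fragment}`);

	const node = findTextNode(root, text);
	if (node && node.parentElement) {
		// can't scroll to a text node directly, so scroll to its parent
		node.parentElement.scrollIntoView({ behavior: 'smooth', block: 'center' });
		return true;
	}

	return false;
}
```

Note that this only scrolls; it doesn’t reproduce the browser’s highlight, so ::target-text styling won’t apply.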

Please consult the Navigation API Appendix for more information about which navigations count as user-initiated.

Extracting The Fragment

Suppose that you want to retrieve the text directive programmatically, either for analytical purposes or to alter specific scrolling behavior, such as disabling scroll-to-bottom when a text fragment directive is present.

Well, since the directives are stored in the hash, we should be able to read the hash property from the URL and parse it from there, no?

Extracting text fragment directive that definitely doesn't work

js

// suppose that the current URL is `/cool-page#:~:text=xxx`
const hash = window.location.hash;
// everything after `:~:` would be the fragment directive
const [, directive = ''] = hash.split(':~:');

const fragments = directive
	.split('&') // account for possible multi-fragment
	.filter((part) => part.startsWith('text='))
	.map((part) => decodeURIComponent(part.slice('text='.length)));

console.log(fragments); // it's empty?

Unfortunately, you can’t retrieve text fragments from the hash: per the WICG specification, text fragment directives are stripped from the URL and aren’t exposed to page scripts.

This behavior exists to maintain compatibility with the hash-based navigation used in old client-side routing libraries (old React Router, anyone?). While the need to expose the directive is acknowledged, it’s still not progressing anywhere.
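
What scripts can do is check whether the browser supports text fragments at all. Browsers that implement the feature expose a document.fragmentDirective object, which as far as I know is currently empty and exists mainly for feature detection. The supportsTextFragments wrapper below is a hypothetical name of mine, sketched under that assumption.

```javascript
// Hypothetical helper: feature-detect text fragment support.
// `doc` defaults to the global document and can be swapped in tests.
function supportsTextFragments(doc = globalThis.document) {
	return Boolean(doc) && 'fragmentDirective' in doc;
}
```

This tells you whether a text fragment link could work in the current browser, but nothing about which directive, if any, brought the user to the page.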

At the time of writing, there’s an acknowledged bug in Chromium-based browsers that lets you retrieve the text fragment directive through the Performance API, specifically by parsing the PerformanceNavigationTiming entry.

Extracting text fragment directive that works (on Chromium browsers)

js

const [navigationEntry] = performance.getEntriesByType('navigation');

// `name` of a `PerformanceNavigationTiming` entry is the full document URL
const hash = navigationEntry ? new URL(navigationEntry.name).hash : '';
// everything after `:~:` is the fragment directive
const [, directive = ''] = hash.split(':~:');

const fragments = directive
	.split('&') // account for possible multi-fragment
	.filter((part) => part.startsWith('text='))
	.map((part) => decodeURIComponent(part.slice('text='.length)));

console.log(fragments); // it's NOT empty 🎉

But as a good developer, you shouldn’t rely on a bug even if you’re only targeting Chromium-based browsers.

Security & Privacy

A text fragment directive can be created by anyone who has access to your website, so it’s reasonable to consider its security and privacy implications.

While in an earlier section I wrote that the text fragment directive works on user-activated navigations, there is one more requirement the directive must fulfill: cross-origin requests must happen in a noopener context, which ensures the target page is isolated.

This context requirement ensures that the directive won’t be invoked from an <iframe> navigation and that text won’t be searched inside an <iframe>. While the text fragment directive does look secure at a glance, it does open a new attack vector related to scrolling.

This attack vector has raised privacy concerns, leading Brave to temporarily disable the directive. It has been re-enabled in version 1.78, which should be released by the time this post goes public.

If you’re in full control of your site, you can tell the browser not to process any fragments site-wide by setting the Document-Policy header to the following value.

Force the page to ignore fragments, including element fragments

# ⚠️ Warning! This disables element fragment too!
Document-Policy: force-load-at-top
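
If your pages are served from a Node.js server, setting the header might look like the sketch below. forceLoadAtTop is a hypothetical helper of mine; the only thing it relies on is the standard setHeader method of a Node.js HTTP response.

```javascript
// Hypothetical helper: opt a response out of fragment scrolling.
// ⚠️ This disables element fragments too, not only text fragments.
function forceLoadAtTop(res) {
	res.setHeader('Document-Policy', 'force-load-at-top');
}

// Sample usage with Node's built-in http module (not executed here):
//
// const http = require('node:http');
// http.createServer((req, res) => {
// 	forceLoadAtTop(res);
// 	res.end('<!doctype html><p>Always loads at the top</p>');
// }).listen(3000);
```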

For enterprise environments that use Chrome, network administrators may also disable text fragments by setting the ScrollToTextFragmentEnabled policy.

Final Thoughts

First implemented on Google Search in 2020, the text fragment directive has provided a better way to share higher quality references via URLs by allowing direct links to arbitrary text anywhere on a web page, compared to the element fragment.

At the same time, text fragments let users share references in a less restrictive way, as they are not as dependent on how much control the link creator has over the content’s structure, giving more freedom to the users.

With these advantages in mind and the ease of generating them using browser plugins, I hope that you will consider adopting text fragment directives as a higher-quality alternative for providing references and citations. Happy (deep)linking!

Tags

Web
Javascript