Journal tags: h-entry

3

Reading

At the beginning of the year, Remy wrote about extracting Goodreads metadata so he could create his end-of-year reading list. More recently, Mark Llobrera wrote about how he created a visualisation of his reading history. In his case, he’s using JSON to store the information.

This kind of JSON storage is exactly what Tom Critchlow proposes in his post, Library JSON - A Proposal for a Decentralized Goodreads:

Thinking through building some kind of “web of books” I realized that we could use something similar to RSS to build a kind of decentralized GoodReads powered by indie sites and an underlying easy to parse format.

His proposal looks kind of similar to what Mark came up with. There’s a title, an author, an image, and some kind of date for when you started and/or finished reading the book.

Matt then points out that RSS gets close to the data format being suggested and asks how about using RSS?:

Rather than inventing a new format, my suggestion is that this is RSS plus an extension to deal with books. This is analogous to how the podcast feeds are specified: they are RSS plus custom tags.

Like Matt, I’m in favour of re-using existing wheels rather than inventing new ones, mostly to avoid a 927 situation.

But all of these proposals—whether JSON or RSS—involve the creation of a separate file, and yet the information is originally published in HTML. Along the lines of Matt’s idea, I could imagine extending the h-entry collection of class names to allow for books (or films, or other media). It already handles images (with u-photo). I think the missing fields are the date-related ones: when you start and finish reading. Those fields are present in a different microformat, h-event in the form of dt-start and dt-end. Maybe they could be combined:


<article class="h-entry h-event h-review">
<h1 class="p-name p-item">Book title</h1>
<img class="u-photo" src="image.jpg" alt="Book cover.">
<p class="p-summary h-card">Book author</p>
<time class="dt-start" datetime="YYYY-MM-DD">Start date</time>
<time class="dt-end" datetime="YYYY-MM-DD">End date</time>
<div class="e-content">Remarks</div>
<data class="p-rating" value="5">★★★★★</data>
<time class="dt-published" datetime="YYYY-MM-DDThh:mm">Date of this post</time>
</article>

That markup is simultaneously a post (h-entry) and an event (h-event) and you can even throw in h-card for the book author (as well as h-review if you like to rate the books you read). It can be converted to RSS and also converted to .ics for calendars—those parsers are already out there. It’s ready for aggregation and it’s ready for visualisation.

I publish very minimal reading posts here on adactio.com. What little data is there isn’t very structured—I don’t even separate the book title from the author. But maybe I’ll have a little play around with turning these h-entries into combined h-entry/event posts.

Webmentions at Indie Web Camp Berlin

I was in Berlin for most of last week, and every day was packed with activity:

Indie Web Camp on Saturday and Sunday,
Accessibility Club on Monday,
Beyond Tellerrand on Tuesday and Wednesday.

By the time I got back to Brighton, my brain was full …just in time for FF Conf.

All of the events were very different, but equally enjoyable. It was also quite nice to just attend events without speaking at them.

Indie Web Camp Berlin was terrific. There was an excellent turnout, and once again, I found that the format was just right: a day of discussions (BarCamp style) followed by a day of doing (coding, designing, hacking). I got very inspired on the first day, so I was raring to go on the second.

What I like to do on the second day is try to complete two tasks; one that’s fairly straightforward, and one that’s a bit tougher. That way, when it comes time to demo at the end of the day, even if I haven’t managed to complete the tougher one, I’ll still be able to demo the simpler one.

In this case, the tougher one was also tricky to demo. It involved a lot of invisible behind-the-scenes plumbing. I was tweaking my webmention endpoint (stop sniggering—tweaking your endpoint is no laughing matter).

Up until now, I could handle straightforward webmentions, and I could handle updates (if I receive more than one webmention from the same link, I check it each time). But I needed to also handle deletions.

The spec is quite clear on this. A 404 isn’t enough to trigger a deletion—that might be a temporary state. But a status of 410 Gone indicates that a resource was once here but has since been deliberately removed. In that situation, any stored webmentions for that link should also be removed.

Anyway, I think I got it working, but it’s tricky to test and even trickier to demo. “Not to worry”, I thought, “I’ve always got my simpler task.”

For that, I chose to add a little map to my homepage showing the last location I published something from. I’ve been geotagging all my content for years (journal entries, notes, links, articles), but not really doing anything with that data. This is a first step to doing something interesting with many years of location data.

I’ve got it working now, but the demo gods really weren’t with me at Indie Web Camp. Both of my demos failed. The webmention demo failed quite embarrassingly.

As well as handling deletions, I also wanted to handle updates where a URL that once linked to a post of mine no longer does. Just to be clear, the URL still exists—it’s not 404 or 410—but it has been updated to remove the original link back to one of my posts. I know this sounds like another very theoretical situation, but I’ve actually got an example of it on my very first webmention test post from five years ago. Believe it or not, there’s an escort agency in Nottingham that’s using webmention as a vector for spam. They post something that does link to my test post, send a webmention, and then remove the link to my test post. I almost admire their dedication.

Still, I wanted to foil this particular situation so I thought I had updated my code to handle it. Alas, when it came time to demo this, I was using someone else’s computer, and in my attempt to right-click and copy the URL of the spam link …I accidentally triggered it. In front of a room full of people. It was midly NSFW, but more worryingly, a potential Code Of Conduct violation. I’m very sorry about that.

Apart from the humiliating demo, I thoroughly enjoyed Indie Web Camp, and I’m going to keep adjusting my webmention endpoint. There was a terrific discussion around the ethical implications of storing webmentions, led by Sebastian, based on his epic post from earlier this year.

We established early in the discussion that we weren’t going to try to solve legal questions—like GDPR “compliance”, which varies depending on which lawyer you talk to—but rather try to figure out what the right thing to do is.

Earlier that day, during the introductions, I quite happily showed webmentions in action on my site. I pointed out that my last blog post had received a response from another site, and because that response was marked up as an h-entry, I displayed it in full on my site. I thought this was all hunky-dory, but now this discussion around privacy made me question some inferences I was making:

By receiving a webention in the first place, I was inferring a willingness for the link to be made public. That’s not necessarily true, as someone pointed out: a CMS could be automatically sending webmentions, which the author might be unaware of.
If the linking post is marked up in h-entry, I was inferring a willingness for the content to be republished. Again, not necessarily true.

That second inferrence of mine—that publishing in a particular format somehow grants permissions—actually has an interesting precedent: Google AMP. Simply by including the Google AMP script on a web page, you are implicitly giving Google permission to store a complete copy of that page and serve it from their servers instead of sending people to your site. No terms and conditions. No checkbox ticked. No “I agree” button pressed.

Just sayin’.

Anyway, when it comes to my own processing of webmentions, I’m going to take some of the suggestions from the discussion on board. There are certain signals I could be looking for in the linking post:

Does it include a link to a licence?
Is there a restrictive robots.txt file?
Are there meta declarations that say noindex?

Each one of these could help to infer whether or not I should be publishing a webmention or not. I quickly realised that what we’re talking about here is an algorithm.

Despite its current usage to mean “magic”, an algorithm is a recipe. It’s a series of steps that contribute to a decision point. The problem is that, in the case of silos like Facebook or Instagram, the algorithms are secret (which probably contributes to their aura of magical thinking). If I’m going to write an algorithm that handles other people’s information, I don’t want to make that mistake. Whatever steps I end up codifying in my webmention endpoint, I’ll be sure to document them publicly.

Indie web building blocks

I was back in Nürnberg last week for the second border:none. Joschi tried an interesting format for this year’s event. The first day was a small conference-like gathering with an interesting mix of speakers, but the second day was much more collaborative, with people working together in “creator units”—part workshop, part round-table discussion.

I teamed up with Aaron to lead the session on all things indie web. It turned out to be a lot of fun. Throughout the day, we introduced the little building blocks, one by one. By the end of the day, it was amazing to see how much progress people made by taking this layered approach of small pieces, loosely stacked.

relme

The first step is: do you have a domain name?

Okay, next step: are you linking from that domain to other profiles of you on the web? Twitter, Instagram, Github, Dribbble, whatever. If so, here’s the first bit of hands-on work: add rel="me" to those links.

<a rel="me" href="https://twitter.com/adactio">Twitter</a>
<a rel="me" href="https://github.com/adactio">Github</a>
<a rel="me" href="https://www.flickr.com/people/adactio">Flickr</a>

If you don’t have any profiles on other sites, you can still mark up your telephone number or email address with rel="me". You might want to do this in a link element in the head of your HTML.

<link rel="me" href="mailto:jeremy@adactio.com" />
<link rel="me" href="sms:+447792069292" />

IndieAuth

As soon as you’ve done that, you can make use of IndieAuth. This is a technique that demonstrates a recurring theme in indie web building blocks: take advantage of the strengths of existing third-party sites. In this case, IndieAuth piggybacks on top of the fact that many third-party sites have some kind of authentication mechanism, usually through OAuth. The fact that you’re “claiming” a profile on a third-party site using rel="me"—and the third-party profile in turn links back to your site—means that we can use all the smart work that went into their authentication flow.

You can see IndieAuth in action by logging into the Indie Web Camp wiki. It’s pretty nifty.

If you’ve used rel="me" to link to a profile on something like Twitter, Github, or Flickr, you can authenticate with their OAuth flow. If you’ve used rel="me" for your email address or phone number, you can authenticate by email or SMS.

h-entry

Next question: are you publishing stuff on your site? If so, mark it up using h-entry. This involves adding a few classes to your existing markup.

<article class="h-entry">
  <div class="e-content">
    <p>Having fun with @aaronpk, helping @border_none attendees mark up their sites with rel="me" links, h-entry classes, and webmention endpoints.</p>
  </div>
  <time class="dt-published" datetime="2014-10-18 08:42:37">8:42am</time>
</article>

Now, the reason for doing this isn’t for some theoretical benefit from search engines, or browsers, but simply to make the content you’re publishing machine-parsable (which will come in handy in the next steps).

Aaron published a note on his website, inviting everyone to leave a comment. The trick is though, to leave a comment on Aaron’s site, you need to publish it on your own site.

Webmention

Here’s my response to Aaron’s post. As well as being published on my own site, it also shows up on Aaron’s. That’s because I sent a webmention to Aaron.

Webmention is basically a reimplementation of pingback, but without any of the XML silliness; it’s just a POST request with two values—the URL of the origin post, and the URL of the response.

My site doesn’t automatically send webmentions to any links I reference in my posts—I should really fix that—but that’s okay; Aaron—like me—has a form under each of his posts where you can paste in the URL of your response.

This is where those h-entry classes come in. If your post is marked up with h-entry, then it can be parsed to figure out which bit of your post is the body, which bit is the author, and so on. If your response isn’t marked up as h-entry, Aaron just displays a link back to your post. But if it is marked up in h-entry, Aaron can show the whole post on his site.

Okay. By this point, we’ve already come really far, and all people had to do was edit their HTML to add some rel attributes and class values.

For true site-to-site communication, you’ll need to have a webmention endpoint. That’s a bit trickier to add to your own site; it requires some programming. Here’s my minimum viable webmention that I wrote in PHP. But there are plenty of existing implentations you can use, like this webmention plug-in for WordPress.

Or you could request an account on webmention.io, which is basically webmention-as-a-service. Handy!

Once you have a webmention endpoint, you can point to it from the head of your HTML using a link element:

<link rel="mention" href="https://adactio.com/webmention" />

Now you can receive responses to your posts.

Here’s the really cool bit: if you sign up for Bridgy, you can start receiving responses from third-party sites like Twitter, Facebook, etc. Bridgy just needs to know who you are on those networks, looks at your website, and figures everything out from there. And it automatically turns the responses from those networks into h-entry. It feels like magic!

Here are responses from Twitter to my posts, as captured by Bridgy.

POSSE

That was mostly what Aaron and I covered in our one-day introduction to the indie web. I think that’s pretty good going.

The next step would be implementing the idea of POSSE: Publish on your Own Site, Syndicate Elsewhere.

You could do this using something as simple as If This, Then That e.g. everytime something crops up in your RSS feed, post it to Twitter, or Facebook, or both. If you don’t have an RSS feed, don’t worry: because you’re already marking your HTML up in h-entry, it can be converted to RSS easily.

I’m doing my own POSSEing to Twitter, which I’ve written about already. Since then, I’ve also started publishing photos here, which I sometimes POSSE to Twitter, and always POSSE to Flickr. Here’s my code for posting to Flickr.

I’d really like to POSSE my photos to Instagram, but that’s impossible. Instagram is a data roach-motel. The API provides no method for posting photos. The only way to post a picture to Instagram is with the Instagram app.

My only option is to do the opposite of POSSEing, which is PESOS: Publish Elsewhere, and Syndicate to your Own Site. To do that, I need to have an endpoint on my own site that can receive posts.

Micropub

Working side by side with Aaron at border:none inspired me to finally implement one more indie web building block I needed: micropub.

Having a micropub endpoint here on my own site means that I can publish from third-party sites …or even from native apps. The reason why I didn’t have one already was that I thought it would be really complicated to implement. But it turns out that, once again, the trick is to let other services do all the hard work.

First of all, I need to have something to manage authentication. Well, I already have that with IndieAuth. I got that for free just by adding rel="me" to my links to other profiles. So now I can declare indieauth.com as my authorization endpoint in the head of my HTML:

<link rel="authorization_endpoint" href="https://indieauth.com/auth" />

Now I need some way of creating and issuing authentation tokens. See what I mean about it sounding like hard work? Creating a token endpoint seems complicated.

But once again, someone else has done the hard work so I don’t have to. Tokens-as-a-service:

<link rel="token_endpoint" href="https://tokens.indieauth.com/token" />

The last piece of the puzzle is to point to my own micropub endpoint:

<link rel="micropub" href="https://adactio.com/micropub" />

That URL is where I will receive posts from third-party sites and apps (sent through a POST request with an access token in the header). It’s up to me to verify that the post is authenticated properly with a valid access token. Here’s the PHP code I’m using.

It wasn’t nearly as complicated as I thought it would be. By the time a post and a token hits the micropub endpoint, most of the hard work has already been done (authenticating, issuing a token, etc.). But there are still a few steps that I have to do:

Make a GET request (I’m using cURL) back to the token endpoint I specified—sending the access token I’ve been sent in a header—verifying the token.
Check that the “me” value that I get back corresponds to my identity, which is https://adactio.com
Take the h-entry values that have been sent as POST variables and create a new post on my site.

I tested my micropub endpoint using Quill, a nice little posting interface that Aaron built. It comes with great documentation, including a guide to creating a micropub endpoint.

It worked.

Here’s another example: Ben Roberts has a posting interface that publishes to micropub, which means I can authenticate myself and post to my site from his interface.

Finally, there’s OwnYourGram, a service that monitors your Instagram account and posts to your micropub endpoint whenever there’s a new photo.

That worked too. And I can also hook up Bridgy to my Instagram account so that any activity on my Instagram photos also gets sent to my webmention endpoint.

Indie Web Camp

Each one of these building blocks unlocks greater and greater power:

Each one of those building blocks you implement unlocks more and more powerful tools:

But its worth remembering that these are just implementation details. What really matters is that you’re publishing your stuff on your website. If you want to use different formats and protocols to do that, that’s absolutely fine. The whole point is that this is the independent web—you can do whatever you please on your own website.

Still, if you decide to start using these tools and technologies, you’ll get the benefit of all the other people who are working on this stuff. If you have the chance to attend an Indie Web Camp, you should definitely take it: I’m always amazed by how much is accomplished in one weekend.

Some people have started referring to the indie web movement. I understand where they’re coming from; it certainly looks like a “movement” from the outside, and if you attend an Indie Web Camp, there’s a great spirit of sharing. But my underlying motivations are entirely selfish. In the same way that I don’t really care about particular formats or protocols, I don’t really care about being part of any kind of “movement.” I care about my website.

As it happens, my selfish motivations align perfectly with the principles of an indie web.