Hacker News

They can't be doing this. What if just opening the link does a destructive action, like unsubscribing or posting a comment on your website?


Then said link is in violation of web standards that have existed for literally decades. I believe I've heard stories about Google's bots deleting entire forums because the forums performed destructive actions via GET requests to specific URLs.

If what you said makes sense to you, ask yourself how google can crawl any URL at all, considering communicating anything to any server could trigger a destructive action.


I'm curious how one-click unsubscribes continue to function, then. Wouldn't those need to be GET requests (due to being a link)?


Take a look. All modern unsubscribe links are a GET followed by a POST, for exactly this reason.
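A minimal sketch of that pattern (all names here are illustrative, not any real mailer's API): the GET only serves a confirmation page with no side effects, and the actual unsubscribe happens only on an explicit POST.

```python
# Sketch of the GET-then-POST unsubscribe pattern.
# Names are hypothetical, for illustration only.
subscribers = {"alice@example.com"}

def handle_unsubscribe(method, email):
    if method == "GET":
        # Safe: a crawler following the link only fetches this page.
        return 200, f"<form method='POST'>Unsubscribe {email}?</form>"
    elif method == "POST":
        # The destructive action happens only on an explicit POST.
        subscribers.discard(email)
        return 200, f"{email} has been unsubscribed."
    return 405, "Method Not Allowed"
```

A bot that blindly GETs the unsubscribe URL leaves the subscription intact; only submitting the form changes state.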


Today I learned!


Beware, one of the unwritten laws of HN is that any post containing anything vaguely reminiscent of reddit gets downvoted. Upvoted in order to pre-empt the inevitable downvote.


Thanks! I wasn't attempting to bring something reddit-esque to HN, I simply enjoy learning new tidbits every day on HN!


Would you consider an unsubscribe page that uses JavaScript to automatically make a PUT/POST unsubscribe call on page load bad practice, then?

Google's bots wouldn't affect it, since they don't run JavaScript.


Google seems to have been running JavaScript for a while now:

E.g.: http://searchengineland.com/tested-googlebot-crawls-javascri...


They do, and they are (at least where I have worked), but it seems that Google et al. are smart enough not to follow links containing the text "Unsubscribe".


> What if just opening the link does a destructive action, like unsubscribing or posting a comment on your website?

Then whoever created the link that does that is doing web wrong and needs to stop.

GET is, by definition, a safe method. (See RFC 7231, Sec. 4.2.1; RFC 2616, Sec. 9.1.1.) Doing destructive actions via a safe method is plain wrong.


Being wrong is very different from not existing. As long as a spec is only a guideline, there will be people doing it wrong, I guarantee you that.

There are links that do that, and if Google were to follow every link, people would notice very fast.


Google does follow every link, and those people do notice, because their forums get deleted. Very few people do it this wrong for very long.

(Or they get fed up and patch the problem without understanding it, by using robots.txt to block all crawlers from their site.)
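For reference, the blunt robots.txt "fix" described above looks like this; it blocks all well-behaved crawlers site-wide without addressing the unsafe GETs themselves:

```
User-agent: *
Disallow: /
```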


Idempotency of HTTP GET requests is not a theoretical concern.


> Idempotency of HTTP GET requests is not a theoretical concern.

That's true, but the relevant feature here is safety, not idempotence. (Though a safe method is also idempotent, not all idempotent methods are safe.)


Idempotence has nothing to do with it. POSTs are also idempotent.


> Idempotence has nothing to do with it. POSTs are also idempotent.

No, of the "base" HTTP/1.1 methods, all safe methods (GET, HEAD, OPTIONS, TRACE) and some unsafe methods (PUT and DELETE) are idempotent.

POST is neither safe nor idempotent (and safety is the key feature here, rather than idempotence.)

GET, HEAD, and POST are cacheable methods, which you may be confusing with idempotent methods. These are very different categories, however.
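The three categories above can be summarized as a quick sketch, with the per-method properties as given in RFC 7231 for the "base" HTTP/1.1 methods:

```python
# Safe / idempotent / cacheable per RFC 7231, base HTTP/1.1 methods.
METHODS = {
    #           (safe,  idempotent, cacheable)
    "GET":     (True,  True,  True),
    "HEAD":    (True,  True,  True),
    "OPTIONS": (True,  True,  False),
    "TRACE":   (True,  True,  False),
    "PUT":     (False, True,  False),
    "DELETE":  (False, True,  False),
    "POST":    (False, False, True),
}

# Every safe method is idempotent, but not every idempotent method is safe.
assert all(idem for safe, idem, _ in METHODS.values() if safe)
assert any(idem and not safe for safe, idem, _ in METHODS.values())
```

Note how the cacheable column (GET, HEAD, POST) cuts across the other two, which is exactly the confusion being untangled in this subthread.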


POSTs are not idempotent by definition. You can make a server's response to a POST idempotent, but when you want multiple identical requests to have different effects, POST is the method to use (vs. GET, PUT, DELETE, etc.).

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
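A toy sketch of that distinction (hypothetical names): replaying an identical PUT leaves the server in the same state, while replaying an identical POST keeps changing it.

```python
# Hypothetical comment store: idempotent PUT vs non-idempotent POST.
comments = {}

def put_comment(comment_id, text):
    # PUT: replaying the identical request yields the same final state.
    comments[comment_id] = text

def post_comment(text):
    # POST: each replay creates a new resource, changing the state.
    new_id = len(comments) + 1
    comments[new_id] = text
    return new_id

put_comment(1, "hello")
put_comment(1, "hello")      # replay: state unchanged
first = post_comment("hi")
second = post_comment("hi")  # replay: a second comment appears
assert first != second
```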


From

http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

9.1.1 Safe Methods

Implementors should be aware that the software represents the user in their interactions over the Internet, and should be careful to allow the user to be aware of any actions they might take which may have an unexpected significance to themselves or others.

In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.

Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.



