Quora Keeps the World's Knowledge For Itself

published by Eric Mill on
The Stockholm Public Library's proposed Wall of Knowledge.

I recently made the mistake of answering a question on Quora: "What is it like to be a member of the 18F team?"

I have a Quora account, but I've almost never used it. I noticed this question because the user somehow found me and "asked" me to answer it, which caused Quora to email me. It's a fine question, the kind of question Quora excels at getting answered, and I think the answers it produced will stand as an accurate snapshot of how the early 18F team feels about their job and their work.

Sadly, Quora's policy is to lock that snapshot in their own private store, and out of the historical record.

The Internet Archive is one of the world's most fantastical organizations, housed in a majestic temple, and is maniacally devoted to archiving the entire Internet since 1996.

At the Archive's Wayback Machine, you can search through and relive an unbelievably rich swath of the Internet's history. It's one of the most valuable and widely cited collections in the world, and it's actually difficult now to imagine tolerating an Internet without it.

But if you visit Quora's /robots.txt (the Internet's standard for sending web crawlers a strongly worded letter) you'll see that they single out the Internet Archive for exclusion. The Internet Archive respects the robots.txt standard, and so there is no Wayback Machine content for Quora.

Quora recognizes that this is significant enough to merit an explanation in their robots.txt.

 # People share a lot of sensitive material on Quora - controversial political
 # views, workplace gossip and compensation, and negative opinions held of
 # companies. Over many years, as they change jobs or change their views, it is
 # important that they can delete or anonymize their previously-written answers.
 # 
 # We opt out of the wayback machine because inclusion would allow people to
 # discover the identity of authors who had written sensitive answers publicly and
 # later had made them anonymous, and because it would prevent authors from being
 # able to remove their content from the internet if they change their mind about
 # publishing it. As far as we can tell, there is no way for sites to selectively
 # programmatically remove content from the archive and so this is the only way
 # for us to protect writers. If they open up an API where we can remove content
 # from the archive when authors remove it from Quora, but leave the rest of the
 # content archived, we would be happy to opt back in. See the page here:
 # 
 # https://archive.org/about/exclude.php
 # 
 # Meanwhile, if you are looking for an older version of any content on Quora, we
 # have full edit history tracked and accessible in product (with the exception of
 # content that has been removed by the author). You can generally access this by
 # clicking on timestamps, or by appending "/log" to the URL of any content page.
 # 
 # For any questions or feedback about this please email robotstxt@quora.com.

 User-agent: ia_archiver
 Disallow: /

(Update: After publication of this piece, Quora added the first paragraph above. I've updated the excerpt above to match their current robots.txt.)
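
To make those last two lines concrete, here is a minimal sketch in Python of how a crawler that honors robots.txt would read them. It uses the standard library's `urllib.robotparser`. The `may_fetch` helper, the `QUORA_RULES` constant, and the `SomeOtherBot` user agent are names made up for illustration, and this is not the Archive's actual crawler code.

```python
from urllib.robotparser import RobotFileParser

# The two directives from the robots.txt excerpt above.
QUORA_RULES = """\
User-agent: ia_archiver
Disallow: /
"""


def may_fetch(user_agent: str, url: str, rules: str = QUORA_RULES) -> bool:
    """Return True if a robots.txt-respecting crawler may request `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://www.quora.com/any-question-page"  # hypothetical path
    print("ia_archiver:", may_fetch("ia_archiver", page))
    print("SomeOtherBot:", may_fetch("SomeOtherBot", page))
```

Under these rules, `may_fetch("ia_archiver", ...)` comes back `False` for every path on the site, while a crawler announcing any other name is unaffected.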

So, Quora's rationale for blocking the Internet Archive is that Quora can't go back and automatically rewrite history whenever one of its users wants to.

Bear in mind, you can already remove individual pages (in their entirety) from the Internet Archive by adding them to your robots.txt. Quora is asking for the ability to excise specific content from already archived pages.
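
As a rough illustration of that per-page option, here is a sketch of what a narrower rule set looks like to a crawler that respects robots.txt. The paths, the `example.com` host, and the `archivable` helper are all hypothetical; only `urllib.robotparser` is real, and the sketch shows rule matching only, not how the Archive processes exclusions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: exclude two specific pages, leave the rest of the site open.
PER_PAGE_RULES = """\
User-agent: ia_archiver
Disallow: /example-question-one
Disallow: /example-question-two
"""


def archivable(path: str, rules: str = PER_PAGE_RULES) -> bool:
    """Return True if the ia_archiver user agent may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # example.com stands in for the real site.
    return parser.can_fetch("ia_archiver", "https://example.com" + path)


if __name__ == "__main__":
    for path in ("/example-question-one", "/some-other-question"):
        print(path, archivable(path))
```

Note that `Disallow` matches by path prefix, so the finest grain available this way is a whole page (or a whole group of pages), never a single answer or author name within a page.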

Can you imagine if the Archive implemented such a feature? It would make the Archive's historical record completely untrustworthy, and destroy its credibility. You can bet many people, companies, and governments would love the ability to go in and selectively excise or modify the historical record of their work.

But that's not how history works, and it's definitely not how the Internet works. Quora is not a private communications network. When users contribute to Quora, they're participating in Quora's mission: to "share and grow the world's knowledge". Like publishing a book, making a TV show, or any other form of human broadcasting: once it's out there, you can shape its use, but you don't get to withdraw it from the public record.

The Library of Alexandria.

What Quora is asking for from the Internet Archive — and really, since the Archive has no public competition, from the Internet — is unreasonable, short-sighted, and selfish. Quora is simply being a shark about "their" content, at the public's expense.

I usually try to be generous about people's motives, but I'm comfortable assuming the worst here. Quora's reason is simply too flimsy, and its business incentives too tangled up in the outcome, for their comment above to be the full story.

Quora is a free service built on venture capital that will need to monetize its users over the next couple of years, and wouldn't you know, they really want you to visit quora.com, and they really want you to create an account.

In fact, until very recently, Quora would block visitors from seeing more than the first answer unless they logged in. I'm willing to bet most people reading this have run into that popup before.

They've taken down the popup, but all that content is still entirely under Quora's control. If Quora were to collapse next year, it's completely unclear what would happen to the human output they've collected. This is not theoretical: unilateral mass destruction of user-generated content happens all the time.

Quora's question-and-answer system is a generalization of the extraordinarily successful model of Stack Overflow, a Q&A site for coders that quickly became the world's free google-able university and teaching assistant for anyone working in technology.

It's not an overstatement to say that Stack Overflow has completely changed how, and how fast, software developers get their work done today. If you were to chart the days I visit Stack Overflow, it would look awfully similar to the chart on my GitHub profile. Stack Overflow has since expanded into the Stack Exchange network, and in many ways is a direct competitor to Quora.

Like Quora, Stack Overflow is privately run, and like Quora, Stack Overflow depends on a community of active users to generate activity, knowledge, and revenue.

Unlike Quora, Stack Overflow has no problem being archived back to Day 1 of its existence — presumably because the founders understand how the Internet works and understand what it actually means to grow the world's knowledge.

Until Quora understands this, I'll be contributing my knowledge to the world, and not to Quora.

The Internet Archive's ceramic archivists. Photo by Tom Foremski of Silicon Valley Watcher.

  1. female toys

    But you need a good knowledge of poker, poker software, and good bonuses to help you succeed.

  2. Lynell

    It's an amazing piece of writing designed for all the online users; they will take benefit from it I am sure.

  3. sj

    Me too. I have blocked Quora from showing in my search results. I wish they would opt out of Google search so it would not show up in any search results.

  4. Eric Mill

    > I can understand you personally wanting to avoid placing your own content on sites that have a no-archive policy, but what you're saying here is effectively that no-one should be able to place content on no-archive sites.

    That's not my view, nor what I'm arguing. I am arguing that Quora, as an immense human knowledge base, has an obligation to find a better way to navigate these tradeoffs than just disabling all archiving for everyone.

    > Remember Quora is not asking anything of the IA, they're not lobbying them to change their policy, they've just taken a particular view that they want to opt out.

    They are asking something of IA -- that's made clear in their robots.txt. They want IA to provide an API for Quora to automatically update content to reflect individual Quora users' desires to remove their content or identification.

    > They've made this clear to their user base, which is also not required to use Quora if they don't want to.

    As far as I know, they've not made this clear to their user base (which does not read robots.txt). I suspect the Quora user base has widely varying assumptions over what happens to their contributions.

    > You are free to make whatever decisions you like. I'm afraid most people won't acknowledge your right to make their decisions for them.

    We're talking about Quora's defaults, here. Individuals still have recourse to remove content from the Archive. And even if Quora restricts the Archive, others who ignore robots.txt can grab everything just fine.

    That's what makes Quora different than Snapchat, or Whisper. Posting an answer on Quora is an act of publishing. Like all other forms of publishing: authors don't lose copyright, and don't lose recourse, but they do lose a bit of control and can't fully withdraw their work from the public record.

    That's been the case since before the Internet. Now that we have a massive and lively Internet, its Archive is all the more necessary.

  5. Andrew Betts

    I can understand you personally wanting to avoid placing your own content on sites that have a no-archive policy, but what you're saying here is effectively that no-one should be able to place content on no-archive sites.

    Remember Quora is not asking anything of the IA, they're not lobbying them to change their policy, they've just taken a particular view that they want to opt out. They've made this clear to their user base, which is also not required to use Quora if they don't want to. If you feel so strongly, you could contribute to other sites that are pro-archiving, like Wikipedia or Stack Exchange, of which there are many available.

    Some sites have built their entire value proposition on non-archiving of user content, like Snapchat.

    I think the very first post on this comment thread was pretty insightful. You are free to make whatever decisions you like. I'm afraid most people won't acknowledge your right to make their decisions for them.

  6. Eric Mill

    > And who are you to decide that the mission comes with me losing the right to remove, edit or "anonymize" what I write on Quora. That decision is for the Quora team to make.

    That decision is for you to make, and other writers to make, not for Quora. You retain your copyright when contributing to Quora, and your ability to request that material you wrote be taken down from the Archive and anywhere else it may appear.

    The Archive is certainly not the only entity that might try to archive Quora, but this robots.txt rule singles out the Archive. The Archive obeys robots.txt, and is an approachable entity that is responsive to takedown requests.

    In other words, the issue here is not about whether or not people have a right to privacy, or to be forgotten. The issue here is what can be automated, and what the defaults should be.

    Currently, Quora allows no automation, and has no apparent way for users to opt-in to automatically having their content preserved outside of Quora. And so right now, this massive knowledge base is not being publicly preserved, in case an individual user changes their mind about their comment or their identification after publishing.

    In my opinion, that tradeoff is totally unacceptable. Individual users have recourse to take down content, and Quora should not make a blanket decision on every user's behalf.

  7. Anubhav

    "I have a Quora account, but I've almost never used it." That explains why you do not understand the privacy needs of users on Quora.

    You're quick at making snap judgments based on your biases and shortsightedness. Please grow your understanding and widen your perspective about the issue before you publish such posts.

    "When users contribute to Quora, they're participating in Quora's mission: to "share and grow the world's knowledge"."

    And who are you to decide that the mission comes with me losing the right to remove, edit or "anonymize" what I write on Quora. That decision is for the Quora team to make. As a long-time Quora user, I've made one or two alterations that could otherwise have cost me my job and maybe a little more. I'm glad the team has made such considerations to respect my privacy and that they feel it's more important for me to be able to retain my job than for someone to find an accurate snapshot of how the early 18F team felt about their job and their work.

  8. me

    You all are kidding, right? This is all the more reason to use Quora. I feel my privacy is that much more protected now that I understand this. Quora isn't bad. It's fantastic. My daughter used it for help with Latin homework. I ask (or just read) about things I somehow "missed" in high school. I learn a lot on it.

  9. Eric

    > We're not sure WHY Quora blocked the IA.

    @Kevin - they have a lengthy note in their robots.txt about it, which I cover in this post. And one of their execs left a comment further down in this thread saying the same thing (and an identical comment on Hacker News).

  10. Kevin Burton

    My career involves web crawling. I'm the founder and CEO of Spinn3r and we've been crawling our customers for 7 years now.

    Long story short, these issues are far more complicated than just "X is blocking Y so X must hate Y" ... unless you have evidence.

    The only thing we know for certain is that Quora blocked the IA.

    We're not sure WHY Quora blocked the IA.

    There are many legitimate reasons why Quora could have blocked the IA, but if so, they should resolve them and unblock the IA.

    1. There could be a URL that responds to GET that actually deletes user information. This is a BAD idea but people still do it. A crawler could trigger it.

    2. The IA may have accidentally abused their resources and requested too often. Unlikely but still possible.

    > Not everything that is ever said on the internet needs to be permanent. We deserve the right to be forgotten.

    Maybe. However, your rights end where my rights begin. I have the right to remember. You don't have the right to force someone to delete their content.

  11. Nemo

    Quora is amazing in that it's a smartened-up version of Yahoo! Answers and Facebook: less rubbish and slightly better terms of use. As long as it "eats" them (only in English?) and nothing else, it's an advancement; but to be a real force for good it would need several other things.

  12. Jon D

    Quora should just be wiped off all search engines and, preferably, the internet. Their signup-required to view an answer is nothing but a shady, scammy practice.

    That site should be killed off ASAP.

  13. Ben Marks

    @Marc Bodnick - That is a joke, right? This is one long, slow troll, isn't it? Tell you what. You can take a snapshot of this page, then later on I'll email Eric and ask him to anonymize my post, which I assume will somehow magically update your archive.

  14. Michael

    Why would Quora hand over its most valuable assets before figuring out how to monetize the site? Can't blame them for that. What leaves a bitter taste in my mouth with Quora is that it's mob rule (thus the name), and decidedly liberal. Not being more neutral, it creates an uncomfortable place for people with dissenting ideas, as worthy or unworthy as they may be. As a result, the answers and comments can't really be taken as complete, and context still needs to be factored in by the user. That leaves a lot of subjectivity on a lot of matters.

  15. anony moose

    Archive.org harms many people. Not everything that is ever said on the internet needs to be permanent. We deserve the right to be forgotten.

  16. david

    I highly recommend Scott Hanselman's take on putting knowledge inside walled gardens: http://www.hanselman.com/blog/DoTheyDeserveTheGiftOfYourKeystrokes.aspx.

  17. Sujeet Pillai

    http://ideaforge.postach.io/quora-is-just-baaaad

  18. totty

    Just don't save the user's name in the wayback machine. When the bot is from wayback, make all users anonymous.

  19. Kin Lane

    Great post Eric. I am so with you. You've added another reason to my list of why I don't use Quora, and openly advocate others should not as well. (http://apivoice.com/2014/04/11/my-answer-to-why-you-should-not-use-services-like-quora-that-do-not-offer-api-and-data-portability/)

  20. Marc Bodnick

    The reason we opt out of the wayback machine is because this decision lets writers change their mind whether to have an answer published, or change their mind whether to use their name in authoring an answer (i.e., vs. making it anonymous).

    People share a lot of sensitive material on Quora - controversial political views, workplace gossip and compensation, and negative opinions held of companies. Over many years, as they change jobs or change their views, it is important that they can delete or anonymize their previously-written answers.

    I know from first-hand experience that Quora writers sometimes decide to go anonymous after they've shared something sensitive. I do this myself from time to time, and I appreciate the option to make that change; this option gives me more comfort in sharing what I know about sensitive topics.

  21. nononono

    Here is another way StackExchange is archived: https://archive.org/details/stackexchange (database dumps)

  22. Pero

    I find Quora so annoying that I blocked it from showing up in search engine results.

  23. John

    So, don't use Quora?