GrAmazon’s Disastrous Data Import + Foreign Languages = Clusterf*ck

Sarah, GR’s data “Import Master” *snort*, has stated the following:

And, to give some context to what’s going on here:

Over the coming year we’re hoping to expand Goodreads into other countries and make it more accessible to users that speak languages other than English. While we have a large and incredible army of librarians, that army won’t necessarily scale to meet the needs of users all over the world as we begin to fully expand our library to encompass book editions for more and more countries.

Furthermore, even if we could inspire the help of volunteers around the world as quickly as we’ll need them, it wouldn’t be the best use of your/their time to ask you/them to manually enter data for every book that exists in the world.

Right now our goal is to use the information we’ve gathered from this first data import, along with the insights that GR Librarians have shared with us to calibrate our system for clean, non-intrusive imports that will scale as our library grows rapidly over the coming year.

I want to reiterate that we know that you guys do an incredible amount of really really amazing work. Our mission here isn’t to undo or undermine what you’re doing – it’s hopefully to develop a system that will support what you do and make your job easier (even if that’s not how it seems while we feel our way through these initial, somewhat tangled steps.) [screenshot]

Short version: Sarah drops the bomb in the first paragraph, spends the next three reassuring the masses, and runs off hoping her fate is not Kara’s.

Michael echoes my thoughts exactly:

This will not scale indeed. Considering the various issues with data imports, multiple language support, multiple character set support, stray editions in need of combining, duplicated authors, multiple ISBNs without formalized storage thereof, and a data model in need of some revision – to just pump more data into this system will leave it beyond repair, and, I agree, no amount of volunteers will be able to save it. Unless you tackle the architectural/software issues first.

Good Luck. [screenshot]

Reading that entire thread will give you a headache from facepalming, headdesking and bodyflooring at the mess left behind by the Amazon imports. Makes me want to revise my 2014 Goodreads predictions.

12 thoughts on “GrAmazon’s Disastrous Data Import + Foreign Languages = Clusterf*ck”

  1. Translation: we don’t care how bad the goodreads database gets so long as data matching amazon products for sale is there. Even if matching goodreads data to product data means existing book data (including isbn, book covers, author name spellings, etc.) carefully curated for years by several hundred human eyeballs gets trashed and editions are never combined.

    So after this first data feed, we’ll take the,, amazon,DP, etc. site data on products for sale so that those countries can also integrate goodreads onto kindles and increase amazon sales; again, too bad if that means the goodreads book data gets destroyed so long as the product for sale shows.

    No, we have no interest in waiting until foreign language diacritics, author spellings, etc. get straightened out on the current data feed.

    No, sorry, no interest in giving a crap how the auto data feed will combine editions for countries where single-volume English works are split into multiple-volume foreign editions.

    Here comes,, etc. (Which actually I would not mind if human interpreters were hired to translate well and if done AFTER making the regular site more compliant with current disability standards; definitely only after the current butchering of the goodreads database was at least band-aided before adding more …).

    Too bad they need those foreign country kindle sales.


    1. I noted your frustration over there, Debbie. I can’t see how butchering the database will sell more Amazon books. If anything, it will sell fewer.

      People use GR for the database, cataloguing, reviews and groups. Amazon seem determined to systematically destroy every one of those previously attractive aspects. I think they’re hollowing out the site, gutting it, and when everyone abandons it they’ll wonder why everyone left.


    1. *chuckles*

      I’ve not edited anything since September. I can’t wait for the time when GR announce the request for help from the librarian community for clean-up duty.



    2. I hope, hope fervently, that no one answers the call. GR has shown that it doesn’t value the work previously done by the librarians, and now that Amazon is the parent company, let them hire librarians.


  2. So did Kara get the blame for the entire meltdown in September? It does not make me feel reassured when their own get slaughtered in this massive black hole of stupidity.


    1. I don’t know. I wouldn’t be surprised if she quit, either because she didn’t enjoy being thrown to the wolves or she could see where things were heading and didn’t want to deal with the fallout.


  3. It would be interesting to know whether Kara was really to blame for this mess or was just the scapegoat they needed after things went so badly. After all, she might just have been executing marching orders that came down from others. She might even have objected to them, leading to her ousting. I agree she was certainly responsible for the terrible notification, but even she might have been taken by surprise by what was going on and had to scramble to catch up with it. It could not have been her decision to start deleting; that wouldn’t be part of customer care.


