May 19, 2012

Comments

Is there a way for people to enter manuscripts into the database? For example, if someone wanted to run off a few copies of a book they had written.

@Laura -- Good idea. Should fit in nicely, if there's a decision to do it. The easy way for the store, and reasonable for being in an environment like Harvard: You submit PDF(tm), we turn it into paper books.

But some possible copyright problems would have to be solved, like making sure the operation isn't turned into a factory for pirated books.

Which reminds me: Lulu.com has the same problem. Also, Lulu is an established competitor in self-publishing, but it doesn't seem to compete much in the tiny-quantity business.

It would require the customer to be able to turn the MS into something printer-ready; hence the comment about Harvard, or any major university, as a place where you'd find customers who could do that.

BTW, has anyone noticed what a massively irritating pile of shit the current captcha is? Most of the words are simply not unambiguously readable -- oops, I mean arbitrary character strings, because words would make it too easy -- and you don't get to make a couple of tries on a given string to resolve the ambiguities.

So, just keep hitting the continue key till you see something reasonable, and decode that -- I wonder how long till a Security Expert notices this outrageous way of cheating, and killfiles your IP address if you try more than 3 times.

While I'm up, and not wanting to waste an iteration of this nasty process by separating posts, what I wonder about the process is how true the claim is about the product being exactly like a paperback. By which I mean a paperback of good quality, which I think is the claim.

If you print from a real genuine PDF, formatted from text plus markup, then the output is as good as the printer and paper allow. I think. And is that really the same as the page that a publisher will reproduce by photo-offset or something?

But printing from a Google scan of a book? These are optical scans from printed books, and are they at high enough resolution? I have the impression that they are not, but I don't know.

So here's the question: if you print that Google file on the best laser printer, using paper that's good enough for the printer's best efforts; and a good paperback publisher phototypesets the text and prints it for sale to bookstores: can you tell the difference with the naked eye?

Or does the equivalence apply only to a cheap mystery paperback printed on crummy paper from a photocopy of a printed book?

GAAACK.
"what I wonder about the *printing* process"
Not the process of getting the captcha solved, which I have doomed myself to going through once more.

Porlock,
evidently, the older captcha formats have been broken. Also, by doing reCAPTCHA, you are helping Google do OCR. Given that one of the words is not known, it seems that you would only need to get one of the words correct, as long as the attempt at the non-control word is close, so you might want to give it a whirl rather than cycling thru multiple reCAPTCHA challenges.

And, as if by magic, the reCAPTCHA code this time presented a house number to be deciphered. That's the first one I've had, but it had already caused a row in the UK.

NB: the Harvard Book Store isn't the Harvard bookstore, in the usual sense of a university bookstore; that's the Coop, which is now associated with Barnes and Noble.

Harvard Book Store is an independent bookstore up the street a bit. It's a great place.

Speaking of the current captcha, for some reason I haven't been able to sign in with it for some time; the sign-in just hangs when I click "Post". I can sign in with my Twitter account, which is why my name is written funny.

The reCaptcha is a pain. I understand the virtue in helping improve OCR for reading old and hard-to-decipher documents. But most of the offerings lately don't look like that. They just look like deliberately messed up character strings.

Either we ought to be using real examples of documents that OCR is struggling with, or we should go to something a lot simpler. The strings that The Economist (occasionally) uses come to mind.

Whenever I come across one of those incomprehensible reCaptcha things I close my browser tab in disgust. Let some spammer figure out a way to decipher all that jumbled mess, I’ve got better things to do with my time. Also, if a genius hacker could somehow figure out a way for a computer to automatically read those indecipherable reCaptcha strings, more power to him – he just revolutionized the world’s book digitizing industry.

Harvard Book Store is an independent bookstore up the street a bit. It's a great place.

Second.

It really is a great place. Huge selection, amazingly knowledgeable staff, no coffee. Too many Cambridge bookstores have gone under. Really hope this one survives.

Since it's pretty slow around here, and since this is an open thread, I thought I'd link to something short and sweet I read this morning regarding a subject that proved controversial (not abortion-controversial) previously, sparking lots of interesting conversation.

