Searching the text within a PDF? - CSS-Tricks


Viewing 5 posts - 1 through 5 (of 5 total)

Author

Posts
April 22, 2013 at 12:43 am #44288

JoshWhite
Member

Does anyone know of a way to let a user search from a web form and get results based on the text INSIDE a PDF document? I can’t think of any way to show an excerpt, but it would be enough if the search could at least indicate that the returned file contains the information the user was looking for.

April 22, 2013 at 3:50 am #132673

chrisburton
Participant

It’s hard to suggest a solution because your question is a bit vague. Are there structural guidelines for the content of these PDFs? Where are the PDFs coming from?

This is a bit over my head, but have you researched how Google does it with the preview in their search results? When you search a keyword and hover over the arrow, there is a red border around what I believe is a summary or excerpt. Unfortunately, that excerpt or summary does not always contain the keyword. So I’m thinking this must be fairly complex and require some sort of algorithm. Then again, this isn’t my area.

Either way, with PHP you can extract content and/or post an excerpt from a PDF.
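For anyone wondering what "extract content and post an excerpt" could look like in practice, here is a rough sketch. It is in Python rather than PHP, and it is only an illustration, not something anyone in this thread posted. It assumes the external `pdftotext` command-line tool (part of Poppler) is installed; the function names `extract_text`, `make_excerpt`, and `search_documents` are made up for this example.

```python
import re
import shutil
import subprocess


def extract_text(pdf_path):
    """Return the text of a PDF via the external `pdftotext` tool (assumed installed)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],  # "-" writes the extracted text to stdout
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def make_excerpt(text, query, radius=40):
    """Return a short snippet around the first case-insensitive match, or None."""
    match = re.search(re.escape(query), text, re.IGNORECASE)
    if match is None:
        return None
    start = max(match.start() - radius, 0)
    end = min(match.end() + radius, len(text))
    # Collapse line breaks and repeated spaces left over from the PDF layout.
    snippet = " ".join(text[start:end].split())
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + snippet + suffix


def search_documents(texts_by_file, query):
    """Map each file name whose text contains the query to an excerpt."""
    results = {}
    for name, text in texts_by_file.items():
        excerpt = make_excerpt(text, query)
        if excerpt is not None:
            results[name] = excerpt
    return results
```

The idea would be to run `extract_text` once per PDF, store the text somewhere (a database or plain files), and search that stored text when the form is submitted. Scanned PDFs that contain only images would need OCR first, which this sketch does not cover.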

April 22, 2013 at 11:32 pm #132797

chrisburton
Participant

This may not be ideal, but 75 PDFs doesn’t sound like that many. Of course, I don’t know the extent of the content, but why don’t they just copy and paste the text and create a digital web archive?

April 23, 2013 at 3:21 am #132814

chrisburton
Participant

There might be a simpler way to extract that content instead of doing it by hand. I’d suggest asking on Stack Overflow.

April 23, 2013 at 12:40 pm #132777

TheDoc
Member

I *think* you can do this with Google’s Custom Search: https://developers.google.com/custom-search/v1/overview
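As a rough idea of what a request to that API might look like, here is a small Python sketch that only builds the query URL. It assumes you already have an API key and a search engine ID (`cx`) from Google, and it assumes the `fileType` parameter is what you would use to limit results to PDFs. The function name `build_pdf_search_url` is invented for this example, and whether this fits the use case depends on Google having indexed the PDFs in the first place.

```python
from urllib.parse import urlencode

# Endpoint of the Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_pdf_search_url(api_key, engine_id, query):
    """Build a Custom Search request URL restricted to PDF results."""
    params = {
        "key": api_key,     # your API key (placeholder value in the usage below)
        "cx": engine_id,    # your search engine ID (placeholder as well)
        "q": query,         # the user's search terms
        "fileType": "pdf",  # assumption: limits results to PDF files
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_pdf_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "annual report"))
```

Fetching that URL should return JSON describing the matching results, which the form handler could then render as links.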

The forum ‘Other’ is closed to new topics and replies.