• # April 22, 2013 at 12:43 am

    Does anyone know of a way to let a user search from a web form and get results based on the text inside PDF documents? I can’t think of any way to show an excerpt, but it would be enough if it could just indicate that the returned file contains the information the user was looking for.

    # April 22, 2013 at 3:50 am

    Your question is hard to answer because it’s a bit vague. Are there structural guidelines for the content of these PDFs? Where are the PDFs coming from?

    This is a bit over my head, but have you researched how Google does it with the preview in its search results? When you search for a keyword and hover over the arrow, it puts a red border around what I believe is a summary or excerpt. Unfortunately, the excerpt or summary does not always contain the keyword. So I’m thinking this must be fairly complex and require some sort of algorithm. Then again, this isn’t my area.

    Either way, with PHP you can extract content and/or post an excerpt from a PDF.
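
    As a rough sketch of that idea (in Python rather than PHP, and assuming the text has already been pulled out of each PDF with some extraction tool), something like the following could report which files contain the search term and show a short excerpt around the first match. The function name `search_documents`, the `texts` mapping, and the file names are all made up for illustration.

```python
import re


def search_documents(texts, query, context=60):
    """Return (filename, excerpt) pairs for documents whose text contains query.

    `texts` maps a file name to the plain text already extracted from that PDF.
    The extraction step itself is assumed to happen elsewhere.
    """
    # Escape the query so it is matched literally, ignoring case.
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    results = []
    for filename, text in texts.items():
        # Collapse whitespace so line breaks left over from extraction
        # do not split a phrase or clutter the excerpt.
        flat = " ".join(text.split())
        match = pattern.search(flat)
        if match is None:
            continue
        # Keep up to `context` characters on each side of the first match.
        start = max(match.start() - context, 0)
        end = min(match.end() + context, len(flat))
        results.append((filename, flat[start:end]))
    return results


# Hypothetical sample data standing in for text extracted from two PDFs.
sample_texts = {
    "minutes-2012.pdf": "The board approved the\nannual budget for 2013.",
    "newsletter.pdf": "Summer picnic details inside.",
}

for name, excerpt in search_documents(sample_texts, "annual budget"):
    print(name, "->", excerpt)
```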

    # April 22, 2013 at 11:32 pm

    This may not be ideal, but 75 PDFs doesn’t sound like that many. Of course, I don’t know the extent of the content, but why don’t they just copy and paste the text and create a digital web archive?

    # April 23, 2013 at 3:21 am

    There might be a simpler way to extract that content instead of doing it by hand. I’d suggest asking on Stack Overflow.

    # April 23, 2013 at 12:40 pm

    I *think* you can do this with Google’s search.
