Vol. 1 Issue 5 - February 1, 2002 - Using Google's Document Index

In this issue... 

  • Other Google Document-Type Search Examples
  • Word, RTF, Excel Document Searches
  • Finding Organization Charts Online with Google

OTHER RESUME DOCUMENT-TYPE FIELD SEARCH EXAMPLES

Google's amazing new document-type index feature opens up a gold mine of data that CyberSleuths previously could not search. Adobe PDF files are the most common of these documents, but they are not the only previously untapped resource. After PDF, the next most common file types are PostScript and MS Office. PostScript files are created when you send something to print, and documents are frequently stored in this format. MS Office documents include not just the obvious DOC but also the common RTF, or Rich Text Format. Plain text documents with the file extension TXT are the third most common file type. To illustrate, let's repeat the previous issue's Account Manager example, but this time look for DOC instead of PDF.

From Google enter the following search string:
filetype:DOC "Account Manager" "Nortel Networks"

Note that the filetype command, like search terms in general at Google, is not case-sensitive. (The OR operator used later in this issue is an exception: it must be typed in capitals.) There are about 300 MS Word results, many of which are good resumes. Adding another skills keyword like "optical" narrows the results down to a more manageable number, between 80 and 100. Do you have the application required to view this file type? In this case you need MS Office, or at least MS Word. If you don't, simply click on the "View as HTML" link provided for each result to view Google's translation into a simple version viewable in your browser. But that's not all. There are other file types people use for resumes. Here is one more search string, using the skills keyword "SONET" (Synchronous Optical Network).

Enter this search string in Google:
SONET resume (filetype:pdf OR filetype:ps OR filetype:doc OR filetype:txt OR filetype:rtf)

There must be hundreds of unexplored SONET resumes among these 1,000-plus results. Most other searches performed on the Net bring back mostly static HTML pages and overlook this wealth of documents. A thousand results are too many, but with a little fine-tuning you can reduce this list to a few dozen very relevant resumes. Adding the term "CCIE" (Cisco Certified Internetwork Expert) narrows the results to the SONET-savvy prospects, like this:

From Google enter the following search string:
SONET CCIE resume (filetype:pdf OR filetype:ps OR filetype:doc OR filetype:txt OR filetype:rtf)
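
If you type strings like the ones above often, a few lines of script can assemble them for you. Here is a minimal Python sketch. It is not a Google tool, and the function name build_filetype_query is our own invention for illustration. It only produces the text you would paste into the Google search box.

```python
def build_filetype_query(keywords, filetypes):
    """Join keywords with one parenthesized OR group of filetype: terms."""
    # Each extension becomes a term such as filetype:pdf
    type_terms = [f"filetype:{ext}" for ext in filetypes]
    # OR must be capitalized for Google to treat it as "either one"
    type_group = "(" + " OR ".join(type_terms) + ")"
    return " ".join(list(keywords) + [type_group])


query = build_filetype_query(
    ["SONET", "CCIE", "resume"],
    ["pdf", "ps", "doc", "txt", "rtf"],
)
print(query)
```

Swap in your own skills keywords, or drop a file type from the list, to produce a new search string without retyping the whole OR group.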

The fun doesn't end here. Other types of documents contain very useful information. For example, how many people use Excel spreadsheets to organize data like sales leads and contact information? Try this search string in Google:

"Lucent Technologies" phone filetype:xls

This search yields some amazing results, such as the following:

http://www.atis.org/pub/clc/niif/nia/22niapar.xls

ATIS is the Alliance for Telecommunications Industry Solutions. The above link is to an Excel "sign-in sheet" containing the Name, Company, Work Phone and E-mail address of attendees at the February 2, 2000 meeting of the Network Interconnection Interoperability Forum (formerly the Network Interconnection Architecture Committee). Very useful, don't you think?

What other document types do you know of that contain useful information?

FINDING ORGANIZATION CHARTS ONLINE WITH GOOGLE

Have you ever wondered, "What happens with all those presentations my manager makes?" Enough of them end up on the Web that it's worth learning how to find them. It's a bonanza of information not mined by your normal Internet search. By searching for documents of the type .ppt, we can find MS PowerPoint files that have been stored on websites for reference. Many times these documents are uploaded to a website for just a few uses, but they end up staying online indefinitely. This is even more common at graduate business schools, where people create slides containing their employer's organizational structure for classroom presentation assignments.

Besides being a common graduate-level homework assignment, PowerPoint slides contain all kinds of useful information, from product to process. However, it's the famous organization chart that reveals the best recruitment data. Many executives who give presentations describe their organization using PowerPoint's simple org chart wizard.

Skeptical? Would you like to know what the Manufacturing and Logistics organization looks like at Cisco Systems?

Enter the following search string in Google:
filetype:ppt cisco chart manufacturing logistics (org OR organization)

One of the few results on this page is a PowerPoint presentation called "M&A The Cisco Case" archived at U Penn's Wharton Management School. Professor Johannes Pennings has his own folder on the site at http://www-management.Wharton.upenn.edu, where he lists all the overhead slides from his courses. What you can't see from there are the ones in a folder called "CustomizedCourses." You can't even browse that folder because directory listing is denied. However, Google was able to index the link. Look at slide #18 of this presentation:

http://www-management.wharton.upenn.edu/penningscourse/CustomizedCourses/CiscoM&A(2).ppt

Still don't believe it works? Say you want to find contacts in the Network Reliability industry. Let's construct the search. First, indicate you want only MS PowerPoint documents by using the command "filetype:" and the file extension "ppt", just like you did for the pdf and doc examples before. Next, add your keyword "network reliability" in quotation marks to indicate you want results with those two words together. Finally, add the word "chart" and modify it with the terms "(org OR organization)" to retrieve documents mentioning either "org chart" or "organization chart."

Enter your final search using the following search string in Google:
filetype:ppt "network reliability" chart (org OR organization)
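
The three steps just described can also be written out as a short script. This is only a rough Python sketch under stated assumptions: the function names org_chart_query and google_search_url are hypothetical, and the link format assumes Google's usual search address with the query passed in the "q" parameter.

```python
from urllib.parse import urlencode


def org_chart_query(phrase, filetype="ppt"):
    """Build an org-chart search for one exact phrase."""
    parts = [
        # Step 1: restrict results to one document type
        f"filetype:{filetype}",
        # Step 2: quote the phrase so the words stay together
        f'"{phrase}"',
        # Step 3: look for "org chart" or "organization chart"
        "chart (org OR organization)",
    ]
    return " ".join(parts)


def google_search_url(query):
    """Return a search link with the query URL-encoded (assumed URL format)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = org_chart_query("network reliability")
url = google_search_url(query)
print(query)
print(url)
```

Encoding the query matters because the quotation marks, colon and parentheses are not safe to paste into a link as they are. Change the phrase to your own industry keyword to reuse the same pattern.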

This search produces few but accurate results. There should be one or two documents named "Test of New Master" at a website whose address starts with www.nric.org. NRIC is the Network Reliability and Interoperability Council. The PowerPoint presentations listed here are archives of their meetings. One meeting took place in June and the other in October of 2001. Pay special attention to slides 3 and 5 of these presentations. Not only will you find direct contact information for key contacts at Lucent, AT&T and Telcordia, but also a list of other people in this field. Visit this link:

http://www.nric.org/meetings/docs/FG2_B1_Council_Update.ppt

There are many other ways to use this "filetype:" command, so feel free to experiment. Try, for example, combining it with either the "intitle:" or "inurl:" command to find presentations named after competitors or skills keywords. Or how about combining the "filetype:" command with the "site:" command? Is there an organization your candidates belong to, like NRIC mentioned above? A search using "filetype:ppt" and "site:www.nric.org" turns up over 50 other presentations at NRIC.org, some of which reveal different names. What if you used "doc" instead of "ppt"? If you search for "filetype:xls" and "site:www.nric.org" in Google, you find the following:

http://www.nric.org/fg/fg2/sc1/focusgroup2.xls

It's a spreadsheet list of 30 NRIC names including employer, address, direct phone, and e-mail. If you need more ideas to get you started just give us a call at 877.293.3541.
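
When you want to run the same "filetype:" plus "site:" combination across several document types, a small loop saves typing. The Python sketch below is illustrative only: the function name site_filetype_queries is made up for this example, and it simply prints search strings for you to paste into Google.

```python
def site_filetype_queries(site, filetypes):
    """Return one 'filetype:EXT site:HOST' search string per extension."""
    return [f"filetype:{ext} site:{site}" for ext in filetypes]


# One search string each for PowerPoint, Word and Excel files at NRIC
for q in site_filetype_queries("www.nric.org", ["ppt", "doc", "xls"]):
    print(q)
```

Replace www.nric.org with the website of any organization your candidates belong to.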