Monday, December 29, 2008
Deep Web Research 2009
Marcus P. Zillman on his Deep Web Research 2009: 'Bots, Blogs and News Aggregators is a keynote presentation that I have been delivering over the last several years, and much of my information comes from the extensive research that I have completed into the "invisible" or what I like to call the "deep" web. The Deep Web covers somewhere in the vicinity of 1 trillion pages of information located through the World Wide Web in various files and formats that the current search engines on the Internet either cannot find or have difficulty accessing. Search engines find about 20 billion pages at the time of this publication. In the last several years, some of the more comprehensive search engines have written algorithms to search the deeper portions of the world wide web by attempting to find files such as .pdf, .doc, .xls, ppt, .ps, and others. These files are predominately used by businesses to communicate information within their organization, or to disseminate information to external communities. Searching for this information using deeper search techniques and the latest algorithms allows researchers to obtain a vast amount of corporate information that was previously unavailable or inaccessible. Research has also shown that even deeper information can be obtained from these files by searching and accessing the "properties" information on these files. This guide is designed to provide a wide range of resources to better understand the history of deep web research. It also includes various classified resources that allow you to search through the currently available web to find key sources of information located via an understanding of how to search the "deep web"