The proposed architecture extends the DRUM technique introduced in "IRLbot: Scaling to 6 Billion Pages and Beyond", the best paper of WWW 2008. That paper applies the technique to a single-machine Web crawler; in the thesis, I extend it to a parallel crawler.
Below is the presentation from my Master's thesis defense, which I passed on 16 December 2010.
The full text of the thesis will be available soon; in the meantime, it can be requested via email. Students and researchers interested in Web crawling are welcome to contact me with questions, comments, feedback, or suggestions.
Can you mail me your thesis?
Sure, please give me your email ID.
212551703@qq.com. Please send me your thesis.
Thank you for sharing your thesis. From the looks of it, you certainly did a great job constructing the paper. Implementing a scalable, high-speed parallel Web crawler is a hard topic to tackle, so it's a plus that you defended it successfully.