I'm doing a research project that collects hotel info from two major travel websites, Expedia and Travelocity. I need to merge the listings for the same hotel from the two sites into a single record before I can do any further analysis. However, the spelling of some hotel names and addresses differs between the sites. For example:
Hotel name: Sheraton Vistana Villages Resort Villas, I-Drive/Orlando
Address: 12401 International Dr, Orlando, FL, 32821
Hotel name: Sheraton Vistana Villages Resort Villas I Drive Orlando
Address: 12401 International Drive Orlando, FL 32821
These are clearly the same hotel, but both the name and the address differ slightly. A simple algorithm based on exact string matching will not be able to match these two records. I need a program that matches hotels using a sound reasoning process, so that it produces well-matched results across all of these hotels. I assume you have done this type of matching work before. Please provide a sample result from a previous job related to my project. Thanks.
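
To illustrate the kind of matching I have in mind, here is a minimal sketch in Python using only the standard library. It is not a finished solution. The abbreviation table, the function names (`normalize`, `similarity`, `is_same_hotel`) and the 0.9 threshold are all assumptions made up for this example, and a real program would need a much larger abbreviation list and careful tuning against real data.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table (an assumption for this sketch).
# Real data needs more entries and care with ambiguous ones,
# e.g. "st" can mean "street" or "saint".
ABBREVIATIONS = {
    "dr": "drive",
    "st": "street",
    "ave": "avenue",
    "blvd": "boulevard",
    "rd": "road",
}


def normalize(text: str) -> str:
    """Lowercase, turn punctuation into spaces, and expand abbreviations."""
    tokens = re.sub(r"[^a-z0-9]+", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 on the normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_same_hotel(name_a: str, addr_a: str, name_b: str, addr_b: str,
                  threshold: float = 0.9) -> bool:
    """Treat two listings as the same hotel only if BOTH the name and
    the address are similar enough. The threshold is an assumed value."""
    return (similarity(name_a, name_b) >= threshold
            and similarity(addr_a, addr_b) >= threshold)


if __name__ == "__main__":
    print(is_same_hotel(
        "Sheraton Vistana Villages Resort Villas, I-Drive/Orlando",
        "12401 International Dr, Orlando, FL, 32821",
        "Sheraton Vistana Villages Resort Villas I Drive Orlando",
        "12401 International Drive Orlando, FL 32821",
    ))
```

With the two example listings above, the names and addresses become identical after normalization, so they are treated as the same hotel. Pairs that differ by more than punctuation and common abbreviations would depend on the similarity threshold, which is the part I expect to need the most work.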