Many resources are available on the internet, and people habitually use them without properly citing the original source or crediting the author. Content taken directly from previously published sources is called plagiarized text. Text comparison is needed to detect unauthorized or illegal use of views, ideas, and publications, so a suitable technique is needed to measure the similarity between two documents. There are many text-matching mechanisms, such as Levenshtein's edit distance, the cosine similarity measure, the Jaccard similarity coefficient, n-grams, Hamming distance, the SCAM algorithm, fingerprinting, and substring matching. However, all of these techniques have disadvantages.
The chosen techniques are primitive, moderate, and advanced, respectively. The aim here is to enhance these algorithms in terms of similarity index and running time, and to provide graphical comparison reports.
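As a rough illustration of what a similarity index can look like, the sketch below implements three of the measures named above: Levenshtein's edit distance, the Jaccard similarity coefficient, and cosine similarity. It is not the project's enhanced algorithm. The function names, the whitespace tokenization, and the lowercasing step are assumptions made only for this example.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def jaccard(a: str, b: str) -> float:
    """Shared distinct words divided by all distinct words."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # assumption: two empty documents count as identical
    return len(sa & sb) / len(sa | sb)


def cosine(a: str, b: str) -> float:
    """Cosine of the angle between the two word-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty document is similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc1 = "the quick brown fox"
    doc2 = "the quick red fox"
    print(levenshtein(doc1, doc2), jaccard(doc1, doc2), cosine(doc1, doc2))
```

Edit distance works on characters and grows with document length, while the Jaccard and cosine scores work on words and fall between 0 and 1, so the three values are not directly comparable without normalization.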
Internships closed for this project.
Ms. Shama, ME Scholar, CSE