|
Abstract:
|
Today's data is rarely stored in a centralized location, both because of the enormous amount of information that must be stored and because distribution increases the reliability, availability, and performance of the system. The same data is often stored in different formats in different companies' databases, and it may be partitioned or replicated. We consider various distributed database scenarios, such as horizontal fragmentation, vertical fragmentation, and attribute overlapping. Allowing access to integrated information from these multiple datasets can provide accurate and complete information to the end user. We study how to query these distributed databases efficiently to obtain the top-k elements according to the ranking order provided by the user. We also discuss a hierarchical way of applying the top-k algorithm and its limitations for our problem. We propose four different algorithms based on the NRA algorithm to solve this problem efficiently, and we compare and contrast these methods. Once the combination of data sources has been identified, we use our algorithms to retrieve the top elements from that combination and process them to obtain the top-k elements according to the user's ranking function. |