Description: When foreign keys point to unexisting parent records, or when parent records exist for which no child (if needed) exist, it can be a heavy task on the database to find such records and correct the situation. Outer joins can stress your database when the tables contain millions of records. It happens that databases run out of temporary resources to complete the queries, leaving you with a lot of hassle. With this project we try to retrieve the foreign key links info into memory of a client, make the necessary 'joins' to detect the records we need to delete or correct. Once these records are identified an sql for each record can be generated.
-note: With proper contstraints your database should never come in this state. However, with large datainserts often foreign key constraints are temporary disabled
-note 2: this can be used to clean records even if they are not part of missing links.
-Approach: Select the id's (== primary key) of all the records of one table, then select the id's and fk field of all the records of the second table. All these small records are kept in memory. Next you search in memory which child records have no parent (or vice versa) and remember their id's. These ideas are used to create delete statements for the records, which are run directly on the database or dumped in a file.
-Future extensions: (for future ppp's) limit the selection to part of the tables instead of the full table. config file, command line options gui interface more than two tables and multi foreign key links.
-User Stories (in random order)
- select the primary key field of all records in a table, store in memory
- select the primary key and one foreing key field of all records in a table, store in memory
- same as 2 with +1 fk field
- create a memory table
- optimize: table with known size
- add record to a memory table
- search in memory for childs with no parent (operation missing parent)
- search in memory for parents with no childs (operation missing child)
- generate delete queries for the records and execute on the database
- generate delete queries and dump the sql to file
- main class with commandline parameters (properties) with config file
- gui of cli to select table, select field(s), select operation (missing parent of child),
- show number of records to delete, confirm and delete
- gui, awt, swt, swing, ...
- ...
Purpose is to have an easy project to explore pairprogramming and some fun along the way.