Removing duplicate records in ORACLE SQL

mrcaption49

Pranav Bakare

Posted on October 25, 2024

Removing duplicate records in ORACLE SQL

Removing duplicate records from the emp table using various methods, including creating the table, inserting sample data, backing up the table, and employing different SQL techniques. This will cover all solutions discussed previously.

Step 1: Create the Employee Table

First, create the emp table to store employee information.

CREATE TABLE emp (
    emp_id NUMBER PRIMARY KEY,
    first_name VARCHAR2(50),
    last_name VARCHAR2(50),
    email VARCHAR2(100),
    department VARCHAR2(50),
    salary NUMBER
);
Enter fullscreen mode Exit fullscreen mode

Step 2: Insert Sample Data

Insert sample data into the emp table, which contains some duplicate records for demonstration purposes.

INSERT INTO emp (emp_id, first_name, last_name, email, department, salary) 
VALUES (1, 'John', 'Doe', 'john.doe@example.com', 'IT', 60000);
INSERT INTO emp (emp_id, first_name, last_name, email, department, salary) 
VALUES (2, 'Jane', 'Smith', 'jane.smith@example.com', 'HR', 65000);
INSERT INTO emp (emp_id, first_name, last_name, email, department, salary) 
VALUES (3, 'John', 'Doe', 'john.doe@example.com', 'IT', 60000); 
-- Duplicate
INSERT INTO emp (emp_id, first_name, last_name, email, department, salary) 
VALUES (4, 'Mark', 'Johnson', 'mark.j@example.com', 'IT', 70000);
INSERT INTO emp (emp_id, first_name, last_name, email, department, salary) 
VALUES (5, 'Jane', 'Smith', 'jane.smith@example.com', 'HR', 65000); 
-- Duplicate
INSERT INTO emp (emp_id, first_name, last_name, email, department, salary) 
VALUES (6, 'Emily', 'Davis', 'emily.d@example.com', 'Finance', 55000);
Enter fullscreen mode Exit fullscreen mode

Current State of the Table

After inserting the records, the emp table will look like this:

Step 3: Back Up the Table

  • Before removing duplicates, create a backup of the emp table to avoid any data loss.
CREATE TABLE emp_backup AS SELECT * FROM emp;
Enter fullscreen mode Exit fullscreen mode

Step 4: Remove Duplicate Records Using Different Methods

Method 1: Using Unique Identifier (Email and MAX Function)

Identify and select duplicates based on the email and keep the maximum emp_id.

-- Identify duplicates

SELECT email, MAX(emp_id) AS max_emp_id, COUNT(*) 
FROM emp 
GROUP BY email 
HAVING COUNT(*) > 1;

-- Delete duplicates
DELETE FROM emp 
WHERE emp_id NOT IN (
    SELECT MAX(emp_id) 
    FROM emp 
    GROUP BY email
);

Enter fullscreen mode Exit fullscreen mode

Method 2: Using Self Join

Remove duplicates using a self-join on the email column.

DELETE FROM emp e1
WHERE EXISTS (
    SELECT 1 
    FROM emp e2 
    WHERE e1.email = e2.email 
      AND e1.emp_id < e2.emp_id
);
Enter fullscreen mode Exit fullscreen mode

Method 3: Using Window Functions

Utilize window functions and an inline view to identify duplicates and delete them.

DELETE FROM emp 
WHERE emp_id IN (
    SELECT emp_id 
    FROM (
        SELECT emp_id, 
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY emp_id) 
AS row_num 
        FROM emp
    ) 
    WHERE row_num > 1
);
Enter fullscreen mode Exit fullscreen mode

Method 4: Using MIN Function

Keep the minimum emp_id for each duplicate email and delete others.

DELETE FROM emp 
WHERE emp_id NOT IN (
    SELECT MIN(emp_id) 
    FROM emp 
    GROUP BY email
);
Enter fullscreen mode Exit fullscreen mode

Step 5: Verify the Table After Deletion

After executing the delete operations, you can verify the remaining records in the emp table to ensure that duplicates have been successfully removed.

SELECT * FROM emp;
Enter fullscreen mode Exit fullscreen mode

Step 6: Confirm Backup Data

You can also check the backup table to ensure that the original data is preserved.

SELECT * FROM emp_backup;
Enter fullscreen mode Exit fullscreen mode

Final State of the emp Table

After removing duplicates, the emp table should look like this:


Conclusion

This comprehensive guide demonstrates various methods for removing duplicate records from the emp table, including the creation of the table, insertion of sample data, backing up the table, and utilizing different SQL techniques. Always ensure to back up your data before performing deletion operations to avoid accidental data loss.

๐Ÿ’– ๐Ÿ’ช ๐Ÿ™… ๐Ÿšฉ
mrcaption49
Pranav Bakare

Posted on October 25, 2024

Join Our Newsletter. No Spam, Only the good stuff.

Sign up to receive the latest update from our blog.

Related

ยฉ TheLazy.dev

About