Auto Increment vs Manual Table ID

When developing a structure for your database, there comes the question of how to pick a unique ID for your tables. In my experience, you have about 3 good options, which I would outline below.

1. Auto Increment

Auto increment is usually a good option, especially when you have a table from which data may be deleted from time to time. It’s more reliable than the next option on this list. In the last year or so, I have used auto increment because I had a problem with deleted data when I used option #3 below. Another problem with this is that the size of the ID’s will increase, depending on the growth rate of the table data.

My advice: if you’re gonna delete data from your table from time to time, try using auto increment. Also, if you have a table who’s data doesn’t increase drastically over a short time, this might also be a good option.

2. Manual (Generated) ID

Before I started using auto increment, I usually generated my unique table ID’s using a simple PHP function that guarantees no repeats or conflicts even when data was deleted. It also guarantees a fix length for the ID field, which in turn limits the range of numbers that can be used for IDs (eg, 4 digits, or 6 digits). Nevertheless, this method of mine can generate up to 8 digits uniquely, if needed. Until now, about 6 digits have met my night as my tables are not huge.

My advice: if you want a fixed length for you table’s unique ID field, a function to generate ID’s would be a good choice. Usually, I just select the last 6 digits of the number of seconds passed since the Unix Epoch or simply…

date("U");

3. Manual (Counted) ID

When I started in PHP, one of the methods I saw people use was to count the number of rows in the table, then increment by 1 to generate the next unique ID. For a while now, I haven’t seen this method used anywhere recently. This method also has the same problem as in method #1 above. If you have 2 items in the table, the next generated ID would be 3. So now you have 3 times. If you delete 1 item, the next could would be 3 and that would conflict with the already existing ‘3’ in the table. This method requires more coding to count, and if you want to check that the newly generated ID doesn’t exist in the table, you’d require more coding again lol

My advice: if you aren’t going to delete from the table, you may stick with it. However, I do not recommend this method. Stick with method #1 or #2 above, accordingly.

Auto Increment
Auto Increment

If you have other ways you assign unique ID’s to your database table, you may share in the comment section below (if you don’t mind)

Advertisements

2 thoughts on “Auto Increment vs Manual Table ID

  1. thenewphalls February 25, 2014 / 14:40

    Method 3 is a bad idea for reasons you already mentioned, but there are a number of issues with method 2.

    With certain storage engines such as InnoDB, it’s recommended to use sequential primary keys because the DBMS will order the rows on disc according to the order of their primary keys. If you’re creating a new row with a primary key value between two existing rows, those rows must be rearranged on insert. Therefore, you need to take care that your manually-generated IDs are sequential (if you’re doing this, you might as well just leave it to the DBMS to auto-increment).

    “I just select the last 6 digits of the number of seconds passed since the Unix Epoch”

    IDs should always be unique, but this method in particular will create a lot of conflicts as the last 6 digits of a timestamp will roll over.

    Limiting your ID to a certain number length will also restrict the range of possible IDs i.e. the number of rows that can be created before you have problems creating new IDs. It may be better to use a different field for the fixed-length number – and leave the ID as just a row identifier.

    In my experience, method #1 is the best option, all things considered.

    Like

    • Kheme February 25, 2014 / 16:09

      Thank you for your comment, but let’s take a look at the issues you raised here…

      Method #2 generates numbers according to the number of seconds passed since January 1 1970 00:00:00 GMT and naturally, each number generated would be at least 1 second apart. In the case where you have bulk insert, then there would be a problem here. For instance, 20 insert statements executed within half a second. However, all the times I used this function, I never seemed to have this problem. Still this is an issue, but sequential values is definitely not a problem.

      Next, the date(“U”) function returns a number like 1393336794 which is 10 digits. Normally, I prefer things shorter, so I pick the last 6 digits which gives 151,200 possible different combinations (If you choose to use all 10 digits, then you have 3,628,800 possible different combinations). However, since values are not inserted every second, then that reduces the number of unique values guaranteed without conflict. So as you have pointed out, this is an issue but might still be useful in tables with say small numbers of entries (like 1,000 or there about).

      Lastly, I usually prefer having a fixed length for the ID field. Also, assigning length of 5 to a field (any field) who’s contents’ lengths never go above 2 is considered waste of allocated space.

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s