Identity column: identity, indexes, datetime

Hi,
We have an largish audit table with many columns. There are
inserts/deletes only on this table, no updates. Its never been indexed
or had a pk. Reports are run against this table, usually on date ranges
(datetime col) with joins to other tables on userid etc.
Recently an identity column was added, to improve performance. For start
I'm wondering would this do anything at all for performance? Presumably
the table is sorted by this field in the absence of any other pk/index?
If so what benefit is this if the column is never used?
Surely the best strategy would be to put a clustered index on the
datetime field and not bother with the identity?
Is there much difference in time overhead adding a clustered,
non-clustered or identity field to existing tables with many rows (this
database is used by many client and the time taken for the schema update
is a concern).
Thanks
JB.jb
Well, choosing clustered index may not be so easy , you will have to
investigate many things. So, choosing "right" column/s will speed up the
query/is and improve perfomance ...
> Is there much difference in time overhead adding a clustered,
> non-clustered or identity field to existing tables with many rows (this
> database is used by many client and the time taken for the schema update
> is a concern).
Take a look at CREATE INDEX... WITH DROP EXISTSING option in the BOL
"jb" <b@.b.com> wrote in message
news:OoI2jPCfHHA.4636@.TK2MSFTNGP03.phx.gbl...
> Hi,
> We have an largish audit table with many columns. There are
> inserts/deletes only on this table, no updates. Its never been indexed or
> had a pk. Reports are run against this table, usually on date ranges
> (datetime col) with joins to other tables on userid etc.
> Recently an identity column was added, to improve performance. For start
> I'm wondering would this do anything at all for performance? Presumably
> the table is sorted by this field in the absence of any other pk/index? If
> so what benefit is this if the column is never used?
> Surely the best strategy would be to put a clustered index on the datetime
> field and not bother with the identity?
> Is there much difference in time overhead adding a clustered,
> non-clustered or identity field to existing tables with many rows (this
> database is used by many client and the time taken for the schema update
> is a concern).
> Thanks
> JB.|||Sory, should be DROP_EXISTING
"Uri Dimant" <urid@.iscar.co.il> wrote in message
news:O62QTeCfHHA.3956@.TK2MSFTNGP03.phx.gbl...
> jb
> Well, choosing clustered index may not be so easy , you will have to
> investigate many things. So, choosing "right" column/s will speed up the
> query/is and improve perfomance ...
>> Is there much difference in time overhead adding a clustered,
>> non-clustered or identity field to existing tables with many rows (this
>> database is used by many client and the time taken for the schema update
>> is a concern).
>
> Take a look at CREATE INDEX... WITH DROP EXISTSING option in the BOL
>
>
>
> "jb" <b@.b.com> wrote in message
> news:OoI2jPCfHHA.4636@.TK2MSFTNGP03.phx.gbl...
>> Hi,
>> We have an largish audit table with many columns. There are
>> inserts/deletes only on this table, no updates. Its never been indexed or
>> had a pk. Reports are run against this table, usually on date ranges
>> (datetime col) with joins to other tables on userid etc.
>> Recently an identity column was added, to improve performance. For start
>> I'm wondering would this do anything at all for performance? Presumably
>> the table is sorted by this field in the absence of any other pk/index?
>> If so what benefit is this if the column is never used?
>> Surely the best strategy would be to put a clustered index on the
>> datetime field and not bother with the identity?
>> Is there much difference in time overhead adding a clustered,
>> non-clustered or identity field to existing tables with many rows (this
>> database is used by many client and the time taken for the schema update
>> is a concern).
>> Thanks
>> JB.
>|||> Recently an identity column was added, to improve performance. For start I'm wondering would this
> do anything at all for performance?
No. Well, it will increase row size a little so it will decrease performance slightly.
> Presumably the table is sorted by this field in the absence of any other pk/index?
Nope. Identity is a logical construct, and will have no bearing on physical structure.
> If so what benefit is this if the column is never used?
None, from a performance standpoint. From a logical standpoint, we get into the discussion of
identifying rows, natural vs. surrogate keys, but that is another topic.
> Surely the best strategy would be to put a clustered index on the datetime field and not bother
> with the identity?
That seems like the natural thing to do. However, another candidate is to cluster over the join
column(s). Do some testing what will give most gain. Whever you cluster on, you probably want to
have non-clustered index(es) on the other(s).
> Is there much difference in time overhead adding a clustered, non-clustered or identity field to
> existing tables with many rows (this database is used by many client and the time taken for the
> schema update is a concern).
Yes. Creating a cl ix will physicall re-sort the table, during you will have exclusive lock (unless
you are on 2005 and create using ONLINE option). Creating nc index will not re-sort data, only build
the index during which you have shared lock (unless using ONLINE). Adding an identity column will
proably mean SQL Server has to touch every row, end result can be similar to creating a cl ix
(sometimes, this can be deferred, but I very very much doubt adding an identity column can be
deferred, considering the identity value need to be populated).
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://www.solidqualitylearning.com/
"jb" <b@.b.com> wrote in message news:OoI2jPCfHHA.4636@.TK2MSFTNGP03.phx.gbl...
> Hi,
> We have an largish audit table with many columns. There are inserts/deletes only on this table, no
> updates. Its never been indexed or had a pk. Reports are run against this table, usually on date
> ranges (datetime col) with joins to other tables on userid etc.
> Recently an identity column was added, to improve performance. For start I'm wondering would this
> do anything at all for performance? Presumably the table is sorted by this field in the absence of
> any other pk/index? If so what benefit is this if the column is never used?
> Surely the best strategy would be to put a clustered index on the datetime field and not bother
> with the identity?
> Is there much difference in time overhead adding a clustered, non-clustered or identity field to
> existing tables with many rows (this database is used by many client and the time taken for the
> schema update is a concern).
> Thanks
> JB.|||It is possible that adding an identity might make index rows on other
indexes smaller (if any other indexes exist). I know the OP stated there
were no other indexes on this particular table, but for argument's sake,
consider what happens when a heap table with NCIXs has an integer identity
column added. The ROWIDs in the NCIXs are reduced in size from 8bytes to
4bytes, reducing the storage size of those NCIXs. For NCIXs with only a
column or two, this can be a substantial reduction in overall size of the
index..
Is this important? Probably not terribly much, but I just thought it's worth
pointing out that although adding an identity column does widen the width of
table row storage, it can actually have the reverse effect on non-clustered
indexes. If performance critical queries rely on those NCIXs, it can
sometimes be a useful technique.. Where the identity is a BigInt however,
there's obviously no such gain..
Regards,
Greg Linwood
SQL Server MVP
http://blogs.sqlserver.org.au/blogs/greg_linwood
"Tibor Karaszi" <tibor_please.no.email_karaszi@.hotmail.nomail.com> wrote in
message news:%2340HOgCfHHA.3932@.TK2MSFTNGP02.phx.gbl...
>> Recently an identity column was added, to improve performance. For start
>> I'm wondering would this do anything at all for performance?
> No. Well, it will increase row size a little so it will decrease
> performance slightly.
>
>> Presumably the table is sorted by this field in the absence of any other
>> pk/index?
> Nope. Identity is a logical construct, and will have no bearing on
> physical structure.
>
>> If so what benefit is this if the column is never used?
> None, from a performance standpoint. From a logical standpoint, we get
> into the discussion of identifying rows, natural vs. surrogate keys, but
> that is another topic.
>
>> Surely the best strategy would be to put a clustered index on the
>> datetime field and not bother with the identity?
> That seems like the natural thing to do. However, another candidate is to
> cluster over the join column(s). Do some testing what will give most gain.
> Whever you cluster on, you probably want to have non-clustered index(es)
> on the other(s).
>
>> Is there much difference in time overhead adding a clustered,
>> non-clustered or identity field to existing tables with many rows (this
>> database is used by many client and the time taken for the schema update
>> is a concern).
> Yes. Creating a cl ix will physicall re-sort the table, during you will
> have exclusive lock (unless you are on 2005 and create using ONLINE
> option). Creating nc index will not re-sort data, only build the index
> during which you have shared lock (unless using ONLINE). Adding an
> identity column will proably mean SQL Server has to touch every row, end
> result can be similar to creating a cl ix (sometimes, this can be
> deferred, but I very very much doubt adding an identity column can be
> deferred, considering the identity value need to be populated).
> --
> Tibor Karaszi, SQL Server MVP
> http://www.karaszi.com/sqlserver/default.asp
> http://www.solidqualitylearning.com/
>
> "jb" <b@.b.com> wrote in message
> news:OoI2jPCfHHA.4636@.TK2MSFTNGP03.phx.gbl...
>> Hi,
>> We have an largish audit table with many columns. There are
>> inserts/deletes only on this table, no updates. Its never been indexed or
>> had a pk. Reports are run against this table, usually on date ranges
>> (datetime col) with joins to other tables on userid etc.
>> Recently an identity column was added, to improve performance. For start
>> I'm wondering would this do anything at all for performance? Presumably
>> the table is sorted by this field in the absence of any other pk/index?
>> If so what benefit is this if the column is never used?
>> Surely the best strategy would be to put a clustered index on the
>> datetime field and not bother with the identity?
>> Is there much difference in time overhead adding a clustered,
>> non-clustered or identity field to existing tables with many rows (this
>> database is used by many client and the time taken for the schema update
>> is a concern).
>> Thanks
>> JB.
>|||>> Yes. Creating a cl ix will physicall re-sort the table, during you will
>> have exclusive lock (unless you are on 2005 and create using ONLINE
>> option). Creating nc index will not re-sort data, only build the index
>> during which you have shared lock (unless using ONLINE). Adding an
>> identity column will proably mean SQL Server has to touch every row, end
>> result can be similar to creating a cl ix (sometimes, this can be
>> deferred, but I very very much doubt adding an identity column can be
>> deferred, considering the identity value need to be populated).
Tested this on 2.2 million records:
Clustered [on datetime field] - 5 mins
Non-clustered - 30 secs
Adding identity - ~1min

Wednesday, March 21, 2012

identity, indexes, datetime

No comments:

Post a Comment

Identity column

Blog Archive

About Me