Showing posts with label keys. Show all posts
Showing posts with label keys. Show all posts

Wednesday, March 21, 2012

Identity Specification limit

what happens when a column marked as Identity Specification reaches the limit? for example, I have some code tables using tinyints as keys, the actual number of entries will be 20 or so but there is some volatility, so eventually the 255 limit will be reached, what happens then?

the same thing applies to ints or bigints used as keys, eventually the database must run out of numbers

information will be appreciated

David Wilson.

Hi,

When the maximum has been reached, an error will be generated. Here is an example:

--CREATE TABLE IDENTITYTEST
--(
-- ID TINYINT IDENTITY(1, 1),
-- TEXTVALUE VARCHAR(50)
--)

DECLARE @.COUNTER INT
SET @.COUNTER = 0

WHILE @.COUNTER < 260
BEGIN
INSERT INTO IDENTITYTEST(TEXTVALUE) VALUES ('VALUE ' + CAST(@.COUNTER AS VARCHAR(3)))
SET @.COUNTER = @.COUNTER + 1
END

/*
RESULT:
Msg 8115, Level 16, State 1, Line 12
Arithmetic overflow error converting IDENTITY to data type tinyint.
Arithmetic overflow occurred.
*/

Reference: http://msdn2.microsoft.com/en-us/library/ms186775.aspx

Greetz,

Geert

Geert Verhoeven
Consultant @. Ausy Belgium

My Personal Blog

|||

right, I did about the same thing (not quite as elegant :-)) - with the same result

the question is - How do you fix it? and how do you code something into a daily checkup routine or the like to find it and fix it before it happens?

Monday, March 19, 2012

Identity Ranges

I'm using Merge replication on a database that was designed using integer identity columns for primary keys. When I create a publisher it's great because Sql Server will create rowguid columns for me on most of the tables; actually all but one table.

The problem comes when I try and use identity ranges for the subscribers. As a test I set up a range on one table, it only allowed for 10 in the range with an 80% threshold. I wanted to see what would happen when say my publisher db inserts 11 new rows. Well, I found out that after the 8th new row it wouldn't insert anymore as the range was exceeded and it gave me an error message saying
The identity range managed by replication is full and must be updated by a replication agent.

The question is; which agent needs to run in order for the new range to be assigned to the publisher? I have seen some people talk about an sp_ that can be run, but in a production environment I wont this to be automatic.

My altenative to ranged identies is using guid uniqueidentifiers. See my other post on this!!

regards

GrahamI assume this is SQL Server 2000.

To answer your questions:
Maybe the table on which merge is not adding the rowguid col is because the table already has a rowguid column?

And since you have 80% threashold, it is failing after the 8th row and I assume you have pub_idrange and range values each set to 10.
You can get by this situation by increasing the numbers, so that the probablity of them running out of numbers is less. Say like 10,000 or 100,000.

And the agent to be run when the id range is full is the merge agent. Running the merge agent will refresh the ranges on publisher and subscriber.
On the publisher, you can also run the sp_adjustpublisheridentityrange to refresh the range and that way you dont need to run the merge agent.

But typically in a production scenario, it is recommended that:
1. You have a decent sized value for the publisher and subscriber id range values
2. Merge agent to run frequently. That way the ranges will be refreshed (if needed) when merge agent runs.

Monday, March 12, 2012

Identity or GUID?

OK Guys,
We are considering replacing the standard IDENTITY column with the UNIQUEIDE
NTIFIER column for all our primary keys.
Good? Bad? Crazy? WTF!?
Any thoughts on index clustering, performance, portability, managability, et
c... would be appreciated.
RobertHi
http://www.sql-server-performance.c...red_indexes.asp
"rmg66" <rgwathney__xXx__primepro.com> wrote in message news:eZPZpnWiGHA.428
4@.TK2MSFTNGP05.phx.gbl...
OK Guys,
We are considering replacing the standard IDENTITY column with the UNIQUEIDE
NTIFIER column for all our primary keys.
Good? Bad? Crazy? WTF!?
Any thoughts on index clustering, performance, portability, managability, et
c... would be appreciated.
Robert|||From what I have read and understand Indexes which gets built on GUID wud be
pretty heavy and so it might have a performance hit.
http://www.thescripts.com/forum/thread82632.html -- this might help.
Best Regards
Vadivel
http://vadivel.blogspot.com
"rmg66" wrote:

> OK Guys,
> We are considering replacing the standard IDENTITY column with the UNIQUEI
DENTIFIER column for all our primary keys.
> Good? Bad? Crazy? WTF!?
> Any thoughts on index clustering, performance, portability, managability,
etc... would be appreciated.
> Robert
>|||Hi Robert,
Why? The only reason I can think is that you are moving more to a
distributed database architecture and you want to guarentee that the
surrogate key (its not really a primary key, the primary key is part of your
data) is unique across databases.
You can still use IDENTITY but encode a site ID into the schema.
NEWID() is random in its generation so the insert will be random across your
index so, you will cause additional IO because the data will be more spread
across the disk (array) so, you might end up with more locking contention
too.
In a word - don't do it.
Oh, also - its a lot harder to debug and 'see' guids when you are working
with the data under DBA mode ;).
Tony Rogerson
SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant
http://sqlserverfaq.com - free video tutorials
"rmg66" <rgwathney__xXx__primepro.com> wrote in message
news:eZPZpnWiGHA.4284@.TK2MSFTNGP05.phx.gbl...
OK Guys,
We are considering replacing the standard IDENTITY column with the
UNIQUEIDENTIFIER column for all our primary keys.
Good? Bad? Crazy? WTF!?
Any thoughts on index clustering, performance, portability, managability,
etc... would be appreciated.
Robert|||2005 will have a new function called:
newsequentialid
http://msdn2.microsoft.com/en-us/library/ms189786.aspx
for 2000
http://www.sqldev.net/xp/xpguid.htm
is an option to overcome the "randomness" of NEWID()
There is a performance hit for using NEWID() in 2000. I won't deny that.
However... .
The big advantage of using GUIDS is that I can
Create my Relationships OUTSIDE of tsql code, aka, (for me) inside DotNet
code.
Read my previous post at:
http://groups.google.com/group/micr...be27c58c993dab7
I don't know if there is a super correct answer.
It depends on what you got going on.
Personally, me and my company are making great strides to get the business
logic OUT OF THE Database, and into the business layer.
See
http://www.codeproject.com/gen/desi...sinessLogic.asp
for more info
Most times, I'm going with some kind of GUID usage, but not the NEWID stuff.
If replication is in your plans, then you need to seriously consider
abandoning IDENTITY's.
But you should research and judge for yourself.
"Tony Rogerson" <tonyrogerson@.sqlserverfaq.com> wrote in message
news:%23ih$yCXiGHA.3904@.TK2MSFTNGP02.phx.gbl...
> Hi Robert,
> Why? The only reason I can think is that you are moving more to a
> distributed database architecture and you want to guarentee that the
> surrogate key (its not really a primary key, the primary key is part of
your
> data) is unique across databases.
> You can still use IDENTITY but encode a site ID into the schema.
> NEWID() is random in its generation so the insert will be random across
your
> index so, you will cause additional IO because the data will be more
spread
> across the disk (array) so, you might end up with more locking contention
> too.
> In a word - don't do it.
> Oh, also - its a lot harder to debug and 'see' guids when you are working

> with the data under DBA mode ;).
> --
> Tony Rogerson
> SQL Server MVP
> http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a
SQL
> Server Consultant
> http://sqlserverfaq.com - free video tutorials
>
> "rmg66" <rgwathney__xXx__primepro.com> wrote in message
> news:eZPZpnWiGHA.4284@.TK2MSFTNGP05.phx.gbl...
> OK Guys,
> We are considering replacing the standard IDENTITY column with the
> UNIQUEIDENTIFIER column for all our primary keys.
> Good? Bad? Crazy? WTF!?
> Any thoughts on index clustering, performance, portability, managability,
> etc... would be appreciated.
> Robert
>
>|||My organization is in the midst of trying to replace our UNIQUEIDENTIFIER
clusetered primary keys with IDENTITY fields. Two reasons: 1) making the
clustered index a UNIQUEIDENTIFIER field increases the size of all the
nonclustered indexes; and 2) UNIQUEIDENTIFIER fields generated with the
NEWID() function are not sequential, so your joins will be much less
efficient.
Oh, and I forgot the last one: changing back is a pain!
"rmg66" <rgwathney__xXx__primepro.com> wrote in message
news:eZPZpnWiGHA.4284@.TK2MSFTNGP05.phx.gbl...
OK Guys,
We are considering replacing the standard IDENTITY column with the
UNIQUEIDENTIFIER column for all our primary keys.
Good? Bad? Crazy? WTF!?
Any thoughts on index clustering, performance, portability, managability,
etc... would be appreciated.
Robert|||> I don't know if there is a super correct answer.
> It depends on what you got going on.
> Personally, me and my company are making great strides to get the business
> logic OUT OF THE Database, and into the business layer.
>
You've missed the boat Sloan, the current thinking is to put the business
logic back into the database because its centralised and easier to manage -
plus you get better resource usage through cached execution code etc...
Google Jim Gray and look up some of his thinking on this.
Tony Rogerson
SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant
http://sqlserverfaq.com - free video tutorials
"sloan" <sloan@.ipass.net> wrote in message
news:%23UU$cfXiGHA.3408@.TK2MSFTNGP05.phx.gbl...
> 2005 will have a new function called:
> newsequentialid
> http://msdn2.microsoft.com/en-us/library/ms189786.aspx
> for 2000
> http://www.sqldev.net/xp/xpguid.htm
> is an option to overcome the "randomness" of NEWID()
>
> There is a performance hit for using NEWID() in 2000. I won't deny that.
> However... .
> The big advantage of using GUIDS is that I can
> Create my Relationships OUTSIDE of tsql code, aka, (for me) inside DotNet
> code.
>
> Read my previous post at:
> http://groups.google.com/group/micr...be27c58c993dab7
>
> I don't know if there is a super correct answer.
> It depends on what you got going on.
> Personally, me and my company are making great strides to get the business
> logic OUT OF THE Database, and into the business layer.
> See
> http://www.codeproject.com/gen/desi...sinessLogic.asp
> for more info
>
> Most times, I'm going with some kind of GUID usage, but not the NEWID
> stuff.
> If replication is in your plans, then you need to seriously consider
> abandoning IDENTITY's.
> But you should research and judge for yourself.
>
> "Tony Rogerson" <tonyrogerson@.sqlserverfaq.com> wrote in message
> news:%23ih$yCXiGHA.3904@.TK2MSFTNGP02.phx.gbl...
> your
> your
> spread
>
> SQL
>|||Why are you considering this? What gains do you expect, or why do you
think you need GUIDs? Aren't Identities working for you?
Have you seen the presentation of Kimberly Tripp on Index Optimization?
(see
http://www.microsoft.com/uk/technet...aspx?videoid=29)
In a fragment of it, she discusses the use of GUIDs as the clustered
index key, and how this use can cause massive fragmentation and (as a
result) abysmal performance.
BTW: there are different opinions about the policy to use an Identity
(or other surrogate key) as the Primary Key for each table. My personal
opinion is, that such a key should never be the first choice. So IMO you
should not have such a policy. One should always try to find a natural
key, and only choose a surrogate key if no useful natural key is found,
or for performance reasons (which means the natural key would still be
an alternate key, enforced with a Unique constraint).
HTH,
Gert-Jan

> rmg66 wrote:
> OK Guys,
> We are considering replacing the standard IDENTITY column with the
> UNIQUEIDENTIFIER column for all our primary keys.
> Good? Bad? Crazy? WTF!?
> Any thoughts on index clustering, performance, portability,
> managability, etc... would be appreciated.
> Robert
>|||Not trying to pick a fight, Tony, but you're one of the few that I've
seen advocate this. I haven't read any of Jim Gray's stuff, but I'll
take a look at it. It seems odd to me to put business logic back into
the database because of the issues of object-relational impedance,
scalability, and general performance considerations.
I prefer clean seperation, myself; I'm even beginning tto question the
need for stored procs because of the failure to seperate business logic
from data retrieval.
Stu
Tony Rogerson wrote:
> You've missed the boat Sloan, the current thinking is to put the business
> logic back into the database because its centralised and easier to manage
-
> plus you get better resource usage through cached execution code etc...
> Google Jim Gray and look up some of his thinking on this.
> --
> Tony Rogerson
> SQL Server MVP
> http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a S
QL
> Server Consultant
> http://sqlserverfaq.com - free video tutorials
>
> "sloan" <sloan@.ipass.net> wrote in message
> news:%23UU$cfXiGHA.3408@.TK2MSFTNGP05.phx.gbl...|||Well, that's great if you're married to Sql Server.
I love Sql Server, don't get me wrong, its my bread and butter.
But you can't guarantee you'll always be in a Sql Server world.
And I don't think I've "missed the boat".
My DataLayer objects return:
IDataReader's
DataSets (typed and untyped)
XmlDocuments
Scalars
voids (or nothings... as in, just make sure what I called worked)
Because I have a good DataLayer, I can switch out the backend database at
any given moment.
Yeah, there will be some issues, but not as drastic as complicated business
logic in my tsql.
The database is usually the bottleneck of any well designed system.
And the quicker I get in and get out, the better.
I'll take a look at Jim Gray's stuff. (is this the same Jim Gray who does
interviews for espn/nba?)
That's fine to say "There are other options out there, which have BL in the
database"
But "you missed the boat", ... thats a little strong for for advocating
another opinion in favor of what I and alot of others have proposed.
Perhaps the experience of having my company merge with another company, with
diffrent RDBMS systems has influenced me somewhat.
I'll stick with the good DataLayer design for now.
"Tony Rogerson" <tonyrogerson@.sqlserverfaq.com> wrote in message
news:OHgQiOaiGHA.4284@.TK2MSFTNGP05.phx.gbl...
business
> You've missed the boat Sloan, the current thinking is to put the business
> logic back into the database because its centralised and easier to
manage -
> plus you get better resource usage through cached execution code etc...
> Google Jim Gray and look up some of his thinking on this.
> --
> Tony Rogerson
> SQL Server MVP
> http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a
SQL
> Server Consultant
> http://sqlserverfaq.com - free video tutorials
>
> "sloan" <sloan@.ipass.net> wrote in message
> news:%23UU$cfXiGHA.3408@.TK2MSFTNGP05.phx.gbl...
that.
DotNet
http://groups.google.com/group/micr...993dab7

business
contention
working
a
managability,
>

Identity keys in a One-to-Zero-or-Many relationship

Considering the following:
CREATE TABLE CustomerTypes
(
PKCustomerType INT IDENTITY (1,1) NOT NULL,
CustomerTypeDesc VARCHAR(30)
)
GO
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Retail')
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Wholesale')
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Corporate')
CREATE TABLE Customers
(
PKCustomer INT IDENTITY (1,1) NOT NULL,
CustomerID VARCHAR(20),
FKCustomerType INT
)
GO
INSERT Customers (CustomerID, FKCustomerType) Values ('CUST001', 1)
INSERT Customers Values ('CUST002', 3)
INSERT Customers Values ('CUST003', 2)
How should I handle a case where I don't want to specify a customer
type for whatever reason? Perhaps it is unknown or not applicable in
some cases. I could add a 'None or N/A' record in the CustomerTypes
table for that purpose, but when validating the data in an INSERT
stored procedure, for example, I would like to have a value to use as
default in the case of a missing of invalid field value being passed
for FKCustomerType. If it weren't an identity field in the
CustomerTypes table, I could just create a CustomerType record with a
PKCustomerType value of 0 and always use that value in such cases. With
it being an identity field, however, I can't count on a specific value
for that purpose. It seems I have two options:
1) Create the CustomerTypes table without PKCustomerType being an
identity field...insert the 'None or N/A' record, assigning it a
PKCustomerType value of, say, 0...alter the table to make
PKCustomerType and identity field (1,1)
2) Merely use NULL in Customers where CustomerType isn't specified.
Is there a better way? If not, any insight on why one approach would be
better than the other? I'm leaning toward option 2 at this point.> 2) Merely use NULL in Customers where CustomerType isn't specified.
If CustomerType is not specified because it is unknown, then this may be a
valid approach.
However, you may want to give users the option of saying Unknown, and have
it be a value in the CustomerTypes table. Then if you later want to
differentiate between unknown and unspecified, you can add an option.
There is no reason to hard-code the default value in the parameter to the
stored procedure. Have the application generate its dropdown from the
CustomerTypes table, and you are done.
A|||>> Considering the following: <<
Let's fix what you posted so that it follows the most basic RDBMS
design principles. For example, why do you have no keys? Why did you
think that IDENTITY is every used? Why did you use PK- prefixes in
violation of ISO-11179 rules?
CREATE TABLE CustomerTypes
(customer_type INTEGER NOT NULL PRIMARY KEY,
customer_type_desc VARCHAR(30) NOT NULL);
I see from the use of IDENTITY that you did not bother with designing
an encoding scheme. You might want to learn how to do that
CREATE TABLE Customers
(customer_id VARCHAR(20) NOT NULL PRIMARY KEY,
(customer_type INTEGER NOT NULL
REFERENCES CustomerTypes (customer_type)
ON UPDATE CASCADE,
. );
VARCHAR(20) is a bit long, but by the definition of an identifier, this
has to be your key. Frankly, I would look for a DUNS number or some
indusrty standard code.
Stop putting prefixes that tell you **how** a data element is used in a
table. The name of a data element is supposed to tell you **what** it
is.
Then you need codes for those situations if they are logically
different. For example, the ICD codes for disease have '000.000' which
means "undiagnosed" and '999.999' which means "we did all the test and
still don't know!" -- very different kinds of missing data! VERY
IMPORTANT to distinguish them!! Matter of life and death, in fact.
] value being passed for FKCustomerType [sic]. <<
Now we are getting to your REAL problem. Let's get back to the basics
of an RDBMS. Rows are not records; fields are not columns; tables are
not files; there is no sequential access or ordering in an RDBMS, so
faking a record number or the lines on a piece of paper with a
proprietary IDENTITY property is dead wrong. You do not have the right
mindset, and all you are going to get on a Newsgroup is a few kludges
to help you fake it for a long period of time before the collapse.
a specific value for that purpose. <<
Surprise! Surprise! Surprise! See how non-relational, non-verifiable,
non-portable proprietary extensions screw up things?
Yes; do it right or kludge it :)
<<
Almost! Stop using IDENTITY. This is an RDBMS and not a sequential
file system.
Maybe; do you need to know anything about the missing values? Or just
that it is missing? Get a copy of SQL PROGRAMMING STYLE, SQL FOR
SMARTIES or DATA & DATABASES and then read the chapters on scales &
measurements, and how to design encoding schemes. The research pattern
is simple:
1) Look for industry standards (Google it!)
2) Look for company standards (see the accounting department for help)
3) In the remaining 5% of the cases where you have to invent something,
pick a type of encoding and follow the rules for good design. My guess
in this example is a hierachy or vector code because customers break
down into tax/tax exempt, domestic/foreign and so forth within
retailers and wholesalers.|||> design principles. For example, why do you have no keys? Why did you
> think that IDENTITY is every used? Why did you use PK- prefixes in
Why do you think SURROGATE KEYS are never used? It would appear you are one
of the only people in the industry who do not use them.
Perhaps its an implementation experience thing compared to theory.

> Almost! Stop using IDENTITY. This is an RDBMS and not a sequential
> file system.
Your bias and misunderstanding of how IDENTITY works is embarrasing - READ
THE MANUAL!

> that it is missing? Get a copy of SQL PROGRAMMING STYLE, SQL FOR
> SMARTIES or DATA & DATABASES and then read the chapters on scales &
Only if you want theory, for practical advice get a SQL Server specific
book.
Tony Rogerson
SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant
http://sqlserverfaq.com - free video tutorials
"--CELKO--" <jcelko212@.earthlink.net> wrote in message
news:1151601604.581696.66800@.75g2000cwc.googlegroups.com...
> Let's fix what you posted so that it follows the most basic RDBMS
> design principles. For example, why do you have no keys? Why did you
> think that IDENTITY is every used? Why did you use PK- prefixes in
> violation of ISO-11179 rules?
> CREATE TABLE CustomerTypes
> (customer_type INTEGER NOT NULL PRIMARY KEY,
> customer_type_desc VARCHAR(30) NOT NULL);
> I see from the use of IDENTITY that you did not bother with designing
> an encoding scheme. You might want to learn how to do that
> CREATE TABLE Customers
> (customer_id VARCHAR(20) NOT NULL PRIMARY KEY,
> (customer_type INTEGER NOT NULL
> REFERENCES CustomerTypes (customer_type)
> ON UPDATE CASCADE,
> .. );
> VARCHAR(20) is a bit long, but by the definition of an identifier, this
> has to be your key. Frankly, I would look for a DUNS number or some
> indusrty standard code.
> Stop putting prefixes that tell you **how** a data element is used in a
> table. The name of a data element is supposed to tell you **what** it
> is.
>
> Then you need codes for those situations if they are logically
> different. For example, the ICD codes for disease have '000.000' which
> means "undiagnosed" and '999.999' which means "we did all the test and
> still don't know!" -- very different kinds of missing data! VERY
> IMPORTANT to distinguish them!! Matter of life and death, in fact.
>
> Now we are getting to your REAL problem. Let's get back to the basics
> of an RDBMS. Rows are not records; fields are not columns; tables are
> not files; there is no sequential access or ordering in an RDBMS, so
> faking a record number or the lines on a piece of paper with a
> proprietary IDENTITY property is dead wrong. You do not have the right
> mindset, and all you are going to get on a Newsgroup is a few kludges
> to help you fake it for a long period of time before the collapse.
>
> Surprise! Surprise! Surprise! See how non-relational, non-verifiable,
> non-portable proprietary extensions screw up things?
>
> Yes; do it right or kludge it :)
>
> Almost! Stop using IDENTITY. This is an RDBMS and not a sequential
> file system.
>
> Maybe; do you need to know anything about the missing values? Or just
> that it is missing? Get a copy of SQL PROGRAMMING STYLE, SQL FOR
> SMARTIES or DATA & DATABASES and then read the chapters on scales &
> measurements, and how to design encoding schemes. The research pattern
> is simple:
> 1) Look for industry standards (Google it!)
> 2) Look for company standards (see the accounting department for help)
> 3) In the remaining 5% of the cases where you have to invent something,
> pick a type of encoding and follow the rules for good design. My guess
> in this example is a hierachy or vector code because customers break
> down into tax/tax exempt, domestic/foreign and so forth within
> retailers and wholesalers.
>|||> How should I handle a case where I don't want to specify a customer
> type for whatever reason? Perhaps it is unknown or not applicable in
It should be NULL because its not specified, if you want to give it a value
you really ought to have a CustomerType of 'Not Specified'.

> CREATE TABLE CustomerTypes
> (
> PKCustomerType INT IDENTITY (1,1) NOT NULL,
> CustomerTypeDesc VARCHAR(30)
> )
> GO
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Retail')
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Wholesale')
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Corporate')
> CREATE TABLE Customers
> (
> PKCustomer INT IDENTITY (1,1) NOT NULL,
> CustomerID VARCHAR(20),
> FKCustomerType INT
> )
> GO
> INSERT Customers (CustomerID, FKCustomerType) Values ('CUST001', 1)
> INSERT Customers Values ('CUST002', 3)
> INSERT Customers Values ('CUST003', 2)
You probably need to have a slight rethink on the constraints....
CREATE TABLE CustomerTypes
(
id INT IDENTITY (1,1) NOT NULL constraint sk_customertypes unique
clustered,
CustomerTypeDesc VARCHAR(30) constraint pk_customertype primary key
nonclustered
)
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Retail')
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Wholesale')
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Corporate')
CREATE TABLE Customers
(
id INT IDENTITY (1,1) NOT NULL constraint sk_customer unique
clustered,
CustomerID VARCHAR(20) not null constraint pk_customer primary key
nonclustered,
CustomerType_id INT NULL references CustomerTypes( id )
)
INSERT Customers (CustomerID, CustomerType_id) Values ('CUST001', 1)
INSERT Customers (CustomerID, CustomerType_id) Values ('CUST003', NULL)
Tony.
Tony Rogerson
SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant
http://sqlserverfaq.com - free video tutorials
"Richard Carpenter" <rumbledor@.hotmail.com> wrote in message
news:1151594340.887702.165300@.b68g2000cwa.googlegroups.com...
> Considering the following:
> CREATE TABLE CustomerTypes
> (
> PKCustomerType INT IDENTITY (1,1) NOT NULL,
> CustomerTypeDesc VARCHAR(30)
> )
> GO
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Retail')
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Wholesale')
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Corporate')
> CREATE TABLE Customers
> (
> PKCustomer INT IDENTITY (1,1) NOT NULL,
> CustomerID VARCHAR(20),
> FKCustomerType INT
> )
> GO
> INSERT Customers (CustomerID, FKCustomerType) Values ('CUST001', 1)
> INSERT Customers Values ('CUST002', 3)
> INSERT Customers Values ('CUST003', 2)
> How should I handle a case where I don't want to specify a customer
> type for whatever reason? Perhaps it is unknown or not applicable in
> some cases. I could add a 'None or N/A' record in the CustomerTypes
> table for that purpose, but when validating the data in an INSERT
> stored procedure, for example, I would like to have a value to use as
> default in the case of a missing of invalid field value being passed
> for FKCustomerType. If it weren't an identity field in the
> CustomerTypes table, I could just create a CustomerType record with a
> PKCustomerType value of 0 and always use that value in such cases. With
> it being an identity field, however, I can't count on a specific value
> for that purpose. It seems I have two options:
> 1) Create the CustomerTypes table without PKCustomerType being an
> identity field...insert the 'None or N/A' record, assigning it a
> PKCustomerType value of, say, 0...alter the table to make
> PKCustomerType and identity field (1,1)
> 2) Merely use NULL in Customers where CustomerType isn't specified.
> Is there a better way? If not, any insight on why one approach would be
> better than the other? I'm leaning toward option 2 at this point.
>|||Why not use a referential integrity constraint?
Unknown and Inapplicable are distinct reasons attributed for missing data.
And therefore, if your business demands that distinction, it makes sense to
use separate values to represent them in a table.
Anith|||Just to add, inapplicable attributes are best addressed using an entity
super-type/sub-type relationship. However, commonly people use to kludge
them with NULLs as an easy workaround.
Anith|||"Richard Carpenter" <rumbledor@.hotmail.com> wrote in message
news:1151594340.887702.165300@.b68g2000cwa.googlegroups.com...
> Considering the following:
> CREATE TABLE CustomerTypes
> (
> PKCustomerType INT IDENTITY (1,1) NOT NULL,
> CustomerTypeDesc VARCHAR(30)
> )
> GO
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Retail')
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Wholesale')
> INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Corporate')
> CREATE TABLE Customers
> (
> PKCustomer INT IDENTITY (1,1) NOT NULL,
> CustomerID VARCHAR(20),
> FKCustomerType INT
> )
> GO
> INSERT Customers (CustomerID, FKCustomerType) Values ('CUST001', 1)
> INSERT Customers Values ('CUST002', 3)
> INSERT Customers Values ('CUST003', 2)
> How should I handle a case where I don't want to specify a customer
> type for whatever reason? Perhaps it is unknown or not applicable in
> some cases. I could add a 'None or N/A' record in the CustomerTypes
> table for that purpose, but when validating the data in an INSERT
> stored procedure, for example, I would like to have a value to use as
> default in the case of a missing of invalid field value being passed
> for FKCustomerType. If it weren't an identity field in the
> CustomerTypes table, I could just create a CustomerType record with a
> PKCustomerType value of 0 and always use that value in such cases. With
> it being an identity field, however, I can't count on a specific value
> for that purpose. It seems I have two options:
> 1) Create the CustomerTypes table without PKCustomerType being an
> identity field...insert the 'None or N/A' record, assigning it a
> PKCustomerType value of, say, 0...alter the table to make
> PKCustomerType and identity field (1,1)
> 2) Merely use NULL in Customers where CustomerType isn't specified.
> Is there a better way? If not, any insight on why one approach would be
> better than the other? I'm leaning toward option 2 at this point.
>
If you add some keys to the CustomerTypes table you will be able to identify
the "Unknown" value by its logical key instead of the surrogate.
The "Inapplicable" case would probably be better handled by decomposing the
Customers table.
CREATE TABLE CustomerTypes
(
PKCustomerType INT IDENTITY (1,1) NOT NULL
CONSTRAINT PK_CustomerTypes PRIMARY KEY,
CustomerTypeDesc VARCHAR(30) NOT NULL
CONSTRAINT AK1_CustomerTypes UNIQUE
);
GO
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Retail');
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Wholesale');
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Corporate');
INSERT CustomerTypes (CustomerTypeDesc) VALUES ('Unknown');
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||Good gravy! I'll try to respond to your individual points without
taking too much offense to your abrasive tone...
--CELKO-- wrote:
> Let's fix what you posted so that it follows the most basic RDBMS
> design principles.
Really, don't trouble yourself. It was a simple example I threw
together to try and illustrate my question. Nothing more. I wasn't
shooting for theoretical precision. I didn't (and still don't, as I
will explain in a moment) think it that necessary in this case.

> For example, why do you have no keys? Why did you
> think that IDENTITY is every used? Why did you use PK- prefixes in
> violation of ISO-11179 rules?
> CREATE TABLE CustomerTypes
> (customer_type INTEGER NOT NULL PRIMARY KEY,
> customer_type_desc VARCHAR(30) NOT NULL);
> I see from the use of IDENTITY that you did not bother with designing
> an encoding scheme. You might want to learn how to do that
> CREATE TABLE Customers
> (customer_id VARCHAR(20) NOT NULL PRIMARY KEY,
> (customer_type INTEGER NOT NULL
> REFERENCES CustomerTypes (customer_type)
> ON UPDATE CASCADE,
> .. );
> VARCHAR(20) is a bit long, but by the definition of an identifier, this
> has to be your key. Frankly, I would look for a DUNS number or some
> indusrty standard code.
It is what it is. If my particular (hypothetical, I might add) business
process requires the use of specific proprietary customer types, then
why complicate it?

> Stop putting prefixes that tell you **how** a data element is used in a
> table. The name of a data element is supposed to tell you **what** it
> is.
I really don't see the significance of the distinction. The fact that
the name of the field starts with PK *does* tell me what it is. I
really think this one is a matter of personal preference.

> Then you need codes for those situations if they are logically
> different. For example, the ICD codes for disease have '000.000' which
> means "undiagnosed" and '999.999' which means "we did all the test and
> still don't know!" -- very different kinds of missing data! VERY
> IMPORTANT to distinguish them!! Matter of life and death, in fact.
A bit dramatic, but I hear what you're saying. However, my question
really doesn't have anything to do with what values are available to
the user, but how to handle it when the user doesn't feel inclined to
specify a fact that is not strictly required. On the application's data
entry screen, the user may only need to worry about some of the
available fields. There is no reason to force them to touch every one
when not all are relevant in every case. My question is, how to best
handle that foreign key [column] value in that case.

ic] value being passed for FKCustomerType [sic]. <<
> Now we are getting to your REAL problem. Let's get back to the basics
> of an RDBMS. Rows are not records; fields are not columns; tables are
> not files; there is no sequential access or ordering in an RDBMS, so
> faking a record number or the lines on a piece of paper with a
> proprietary IDENTITY property is dead wrong. You do not have the right
> mindset, and all you are going to get on a Newsgroup is a few kludges
> to help you fake it for a long period of time before the collapse.
Yeah, I'm guilty of referring to them in the very manner that makes a
certain type of person grit their teeth. I think that since we both
know exactly what I mean, then it really isn't an issue worth allowing
to cloud the matter. Wouldn't you agree?

on a specific value for that purpose. <<
> Surprise! Surprise! Surprise! See how non-relational, non-verifiable,
> non-portable proprietary extensions screw up things?
>
Hence my original question. Again, I really don't understand why it has
to be this difficult. How would *you* go about it, if forcing the user
to touch every field on the data entry form is *not* acceptable?

> Yes; do it right or kludge it :)
>
1) <<
> Almost! Stop using IDENTITY. This is an RDBMS and not a sequential
> file system.
There is nothing sequential about the requirements I've established
here. I don't care if the key is sequentially numbered, random
alpha-numeric or caveman hieroglyphics. All I require is that it be
guaranteed to be unique and generated automatically.

> Maybe; do you need to know anything about the missing values? Or just
> that it is missing? Get a copy of SQL PROGRAMMING STYLE, SQL FOR
> SMARTIES or DATA & DATABASES and then read the chapters on scales &
> measurements, and how to design encoding schemes. The research pattern
> is simple:
> 1) Look for industry standards (Google it!)
> 2) Look for company standards (see the accounting department for help)
> 3) In the remaining 5% of the cases where you have to invent something,
> pick a type of encoding and follow the rules for good design. My guess
> in this example is a hierachy or vector code because customers break
> down into tax/tax exempt, domestic/foreign and so forth within
> retailers and wholesalers.
Honestly, that just seemed like a whole lot of unnecessary nit-picking
and tangent traversal than was really required here. I can understand
if you feel in your infinite wisdom that it is better to suggest to me
where else I might direct my search to find the answer than give it to
me directly, but to put so much effort into avoiding answering my
question while at the same time trying to educate me (or, arguably more
accurate, indoctrinate me according to your own personal opinions) in
related regards does make me wonder who's best interest you have at
heart here.|||Anith Sen wrote:
> Why not use a referential integrity constraint?
> Unknown and Inapplicable are distinct reasons attributed for missing data.
> And therefore, if your business demands that distinction, it makes sense t
o
> use separate values to represent them in a table.
>
That would be my typical thinking as well, but RI doesn't work if the
value is not required unless a default constraint is specified. With an
identity key, the default value for the foreign key column cannot be
specified in the table definition.
On second thought, a better illustration of my question would be a name
suffix (Jr., Sr. III, etc.). A person won't necessarily *have* a suffix
in their name, yet forcing the user to select "None" from a list is
perhaps a bit cumbersome from a UI standpoint. If that suffix column in
the Customers table is a foreign key referencing the primary key column
in a reference table, and that primary key column is an identity type
key, then it would seem that the only workable foreign key value would
be NULL.

Friday, March 9, 2012

Identity Increment

Hello, I need some help writing a script to generate Identity keys. I cannot use the row number generator because I would like to start the identity at a package level variable. Is this possible?
Thank you in advance

I just find this using a web search engine:

http://www.sqljunkies.com/WebLog/sqlbi/archive/2005/05/30/15684.aspx

I know that has been also discussed in this forum; I recommned you tu use the search functionality of the forum

|||http://www.ssistalk.com/2007/02/20/generating-surrogate-keys/

Wednesday, March 7, 2012

Identity Columns - Design Question


Is it a good practice to use identty columns as primary keys? or should
we generate our own keys using a seed table?
Does using Identity give any performance advantages?
Thank you,Hi

> Is it a good practice to use identty columns as primary keys? or should
> we generate our own keys using a seed table?
If you don't care about gaps you can use an IDENTITY propertry as a
PFRIMARY KEY , just thinking about your business requirements which I don't
know , actuallty there was lots of discussions about this in this newsgpoup

> Does using Identity give any performance advantages?
Yes, it is , especially if it has a CLUSTERED INDEX .
"S Chapman" <s_chapman47@.hotmail.co.uk> wrote in message
news:1149591940.115554.74460@.y43g2000cwc.googlegroups.com...
>
> Is it a good practice to use identty columns as primary keys? or should
> we generate our own keys using a seed table?
> Does using Identity give any performance advantages?
> Thank you,
>|||There is a slight performance gain on large tables.
But I don't use them anymore due to re-design issues on upgrades and third
party ware and a number of other integrations.
"Uri Dimant" wrote:

> Hi
>
> If you don't care about gaps you can use an IDENTITY propertry as a
> PFRIMARY KEY , just thinking about your business requirements which I don
't
> know , actuallty there was lots of discussions about this in this newsgpo
up
>
>
> Yes, it is , especially if it has a CLUSTERED INDEX .
>
>
> "S Chapman" <s_chapman47@.hotmail.co.uk> wrote in message
> news:1149591940.115554.74460@.y43g2000cwc.googlegroups.com...
>
>|||This is an altered version of an older response on the same subject, with
emphasis on fundamentals:
A key is an attribute or set of attributes that can uniquely identify an
entity in the conceptual model. At the corresponding logical level, a column
or set of columns that can uniquely identify a row in a table is defined as
a key. In reality, an entity may have more than one candidate that provides
such identification ( hence are called candidate keys ), however due to
obvious reasons one of those candidate keys should be treated as primal (
hence called primary key ) based on practical considerations. Such
considerations should include:
i. Familiarity - meaningful to the user.
ii. Stability - non-volatile, should not be altered frequently.
iii. Simplicity - so that queries are easy to express and optimize.
iv. Irreducibility - no proper subset of the key should uniquely identify a
row in the table. ( some treat this also as a derivation to 2NF, which
prohibits partial key dependencies )
Having stated the above guidelines, regardless of the complexity in your
business model & data requirements, an intelligent database design can find
a middle ground by trading off certain characteristics without compromising
the integrity of the data. And one such compromise is the logical surrogate
key which usually defies the characteristic of familiarity but provides
excellent stability and simplicity. Simple and stable keys are often
recommended over composite/volatile keys esp. when they are used in
referential integrity constraints.
http://www.datraverse.com/technology/sql.php
One of the fundamental principles guiding relational design, is a concept
called physical data independence, which can be simply put as: the users
must be presented with a logical view of the data & need not deal with the
physical implementation and storage details. That is primarily why we deal
with tables, columns, keys etc ( which is logical) instead of files, disk
indices, row positions etc (which is physical). Thus you will see why and
how a surrogate in a clean logical model, should also conform to physical
data independence.
Features like identity, GUID etc provide uniqueness, but there are questions
regarding whether they provide sufficient data independence esp. with
ordering of identity column values at the time of generation. Some aspects
of this issue were addressed here:
http://blogs.msdn.com/sqltips/default.aspx?p=2
The non-updateability of identity columns arguably invalidates relational
assignments and therefore the information principle. Since the internal
mechanism that generates identity values are mostly unknown to the user,
practical data design and quality guidelines involving data verification and
validation cannot be applied to such DBMS generated values either.
At the logical level, performance should be least of your considerations in
key selection, since the primary goal of keys is entity identification that
preserves data integrity. Performance is purely dependent upon the physical
implementation of a database. Obviously values generated and tied closely
and directly to the physical model tend to be better performing. Also values
that are smaller in physical size can improve certain query performance due
to faster disk access i/o etc. Given that identity values are of numeric
datatypes and often smaller in physical size, queries utilizing identity
columns tend to have the same performance benefits.
Anith|||I have read ("i have never tried it out myself") -- Identity values cannot
be used in merge replication.
Best Regards
Vadivel
http://vadivel.blogspot.com
"S Chapman" wrote:

>
> Is it a good practice to use identty columns as primary keys? or should
> we generate our own keys using a seed table?
> Does using Identity give any performance advantages?
> Thank you,
>|||If you have no natural primary key then using the IDENTITY property is
significantly better than roling your own because you won't get the same
contention (blocking and serialisation) you get with other roled methods.
IDENTITY does give a performance advantage - there is no locking to get the
next number.
The one divantage is that you can get gaps because the calculation for
next number is not transaction aware but that doesn't matter in most cases
i've seen in business.
You really need to PRIMARY KEY using your natural key and then use a
surrogate key (the IDENTIITY) everywhere else - in sps, joins etc... the
primary key is then just meta data.
eg.
create table sector (
sector_name nvarchar(100) not null constraint pk_sector primary
key nonclustered,
id int not null constraint sk_sector unique clustered
)
create table sector_coverage (
sector_id int not null references sector( id ),
individual_id int not null references individual( id )
)
Tony.
Tony Rogerson
SQL Server MVP
http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a SQL
Server Consultant
http://sqlserverfaq.com - free video tutorials
"S Chapman" <s_chapman47@.hotmail.co.uk> wrote in message
news:1149591940.115554.74460@.y43g2000cwc.googlegroups.com...
>
> Is it a good practice to use identty columns as primary keys? or should
> we generate our own keys using a seed table?
> Does using Identity give any performance advantages?
> Thank you,
>|||Agreed Tony
"Tony Rogerson" wrote:

> If you have no natural primary key then using the IDENTITY property is
> significantly better than roling your own because you won't get the same
> contention (blocking and serialisation) you get with other roled methods.
> IDENTITY does give a performance advantage - there is no locking to get th
e
> next number.
> The one divantage is that you can get gaps because the calculation for
> next number is not transaction aware but that doesn't matter in most cases
> i've seen in business.
> You really need to PRIMARY KEY using your natural key and then use a
> surrogate key (the IDENTIITY) everywhere else - in sps, joins etc... the
> primary key is then just meta data.
> eg.
> create table sector (
> sector_name nvarchar(100) not null constraint pk_sector primary
> key nonclustered,
> id int not null constraint sk_sector unique clustered
> )
> create table sector_coverage (
> sector_id int not null references sector( id ),
> individual_id int not null references individual( id )
> )
> Tony.
> --
> Tony Rogerson
> SQL Server MVP
> http://sqlblogcasts.com/blogs/tonyrogerson - technical commentary from a S
QL
> Server Consultant
> http://sqlserverfaq.com - free video tutorials
>
> "S Chapman" <s_chapman47@.hotmail.co.uk> wrote in message
> news:1149591940.115554.74460@.y43g2000cwc.googlegroups.com...
>
>

Friday, February 24, 2012

identity column

Hi,
I would like a piece of advice.
I have 3 foreign keys in a table used as primary keys for this table.
Is it useful in that case to have just one identity column that would
be used as the unique primary key, thus no need to have 3 primary keys
?
Advantages vs Drabacks ?

RegardsSam (samuel.berthelot@.voila.fr) writes:
> I would like a piece of advice.
> I have 3 foreign keys in a table used as primary keys for this table.
> Is it useful in that case to have just one identity column that would
> be used as the unique primary key, thus no need to have 3 primary keys
> ?
> Advantages vs Drabacks ?

Sometimes this makes things easier. Sometimes it makes things more
difficult.

The advantage with the IDENTITY column are more directly apparent with
less typing and all that. The disadvantages are more subtle but once you
run into them, you often find that they are more serious.

Thus, unless there is a special reason for it, do not use a surrogate key.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Sam, many developers seem to follow the dogma that every table must
have a unique numeric primary key (PK). I do not. The question comes
down to this: will the application or end users ever use the generated
PK? If no, then why add it? In this case there is no point.

On the other hand you need to consider if this table will have child
rows and how they will relate back to the table in question. You also
have to consider if any of the PK columns are subject to value changes
and if so how such changes will be handled.

So in my opinion the choice on how to construct the PK comes down to
how will the data be accessed and updated? Does the chosen PK meet all
the application requirements?

HTH -- Mark D Powell --|||Maybe it's clearer with my tables. Here are the ones concerned with
only the relevant fields:

User(UserId)
Web(WebId)
Query(QueryId)
UserQueries(UserId, WebId, QueryId)

UserQueries is the trouble maker. Right now (UserId, WebId, QueryId)
are FK and PK at the same timefor this table. They must be FK anyway,
but wouldn't it be better to have something like:

UserQueries(UserQueriesId, UserId, WebId, QueryId)

with userQueriesId being Identity type ?|||I guess you mean that your table has a composite primary key made up of
three columns, each of which is a foreign key on another table? If
that's the correct primary key according to your data model, and you
don't have any serious performance issues, then it's probably best not
to change anything.

You might use IDENTITY as an artificial key either because there is no
natural key for the table, or because the natural key is so wide
physically that it creates real performance issues. Even if you did
decide to use IDENTITY instead of the natural key, you would still keep
a UNIQUE constraint on the natural key to make sure you don't get any
duplicates. And then you would be improving performance at the cost of
more maintenance, so it's probably better not to do it unless you
really have to.

Simon|||Sam (samuel.berthelot@.voila.fr) writes:
> Maybe it's clearer with my tables. Here are the ones concerned with
> only the relevant fields:
> User(UserId)
> Web(WebId)
> Query(QueryId)
> UserQueries(UserId, WebId, QueryId)
> UserQueries is the trouble maker. Right now (UserId, WebId, QueryId)
> are FK and PK at the same timefor this table. They must be FK anyway,
> but wouldn't it be better to have something like:
> UserQueries(UserQueriesId, UserId, WebId, QueryId)
> with userQueriesId being Identity type ?

It could be, but I say that the odds are poor. Unless you really some
use which mandates a surrogate key, there is no reason to have any
UserQueriesId.

If you add an IDENTITY column you should still create a UNIQUE constraint
for (UserId, WebId, QueryId).

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Simon Hayes (sql@.hayes.ch) writes:
> You might use IDENTITY as an artificial key either because there is no
> natural key for the table, or because the natural key is so wide
> physically that it creates real performance issues.

Just to add to this: Adding the IDENTITY column could in fact
decrease your performance as well, as the table would be larger. If
there is a child table, that could use this identity column as its
FK, then that table could indeed be smaller. Then again, you could then
find that more operations on the child would require joining to the
parent table, in which case you are again losing in the IDENTITY column.

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp|||Some real DDL will help. Never use an IDENTITY as a key, sicne by
definition it is not an attribute of the entity being modeled.
Basically it says that you don't know RM and are going back to using a
physical locator, like pointer chains, to navigate yoru data. You have
to have the relational keys anyway; if you don't, you will get
redundant duplicates. Do you really mean to destroy all your data
integrity? How do you know that this "magivalk one size fits all"
number is actually in synch everywhere it is used? You don't. So all
that IDENTITY does is make the physical storage used bigger and
therefore slower. Your code will not port. Etc.

CREATE TABLE Users
(user_id INTEGER NOT NULL PRIMARY KEY,
...);

CREATE TABLE Web
(web_id INTEGER NOT NULL PRIMARY KEY,
..);

CREATE TABLE Queries
(query_id INTEGER NOT NULL PRIMARY KEY,
..) ;

Now throw in some DRI actions and let the schema take care of itself
for you

UserQueries
(user_id INTEGER NOT NULL
REFERENCES Users(user_id)
ON DELETE CASCADE
ON UPDATE CASCADE,
web_id INTEGER NOT NULL
REFERENCES Web(web_id)
ON DELETE CASCADE
ON UPDATE CASCADE,
query_id INTEGER NOT NULL,
REFERENCES Queries(query_id)
ON DELETE CASCADE
ON UPDATE CASCADE,
PRIMARY KEY (user_id, web_id, query_id));

You might want to read a book basic relational theory and look up what
Dr. Codd said about system generatee surrogates. The fact that
IDENTITY is not a key is a matter of definition, not opinion.