We are wanting to model standards for the equipment. But for each
equipment, there may be a completely different set of characteristics
that make up its standard. There's more to it than just this, but
let's keep it simple here. What would be clear and obvious to anyone
who's done even a basic Access database would be to have all of these
characteristics (and I mean all of them across all the different
equipment) as columns in one huge standards table. But wait - we
could have hundreds of characteristics, and most importantly, the users
will (really, just accept this fact PLEASE) want to change or at least
add more characteristics over time. This option is a nightmare and we
(me and the project manager overseeing this) agree it as such. So
what's flexible?
I came up with one and know it will work and is normalized and very
relational, BUT it's a slight bear getting the data out and once
it's out it cannot be easily joined with anything (you will see why
in a bit).
A three table structure - Standards, StandardCharacteristics and
StandardValues. Visual picture available here -->
http://www.flickr.com/photos/ccjjharmon/85629981/
"Standards" holds the standards records that change over time, but
these records don't hold ANY of the values - they are just the
placeholder.
"StandardCharacteristics" holds the names and units for the
characteristics and can be associated to multiple pieces of equipment
(separate table to encapsulate this - not shown though).
"StandardValues" holds the meat (and by far the most records) by
containing the characteristic and value pair per record. A set of these
records makes up the standards when joined with the Standards record
associated.
So this does work for data storage - but getting the data out is a
little challenging. A crosstab(based on this article:
http://www.windowsitpro.com/SQLServ...5608/15608.html)
was used to flip StandardValues rows into columns in addition to the
Standards columns.
OK - so that's my design. What other options are out there? Any
thoughts on my design? Is it good? Bad?Look up all the problems with EAV (Entity-Attribute-Value) designs. I
worked for a company that did this all the time; it falls apart in a
year of production work, eats up storage and the simplest queries
become nightmares.
Also, you might want to learn the diffrences between rows and records.
It will help you to think in a proper data model.|||Is this SQL 2005?
If so, it sounds like a great candidate for storing part of your data in XML
.
If it is not 2005, I wouldn't recommend using XML. But, in 2005 the support
is solid, and the use case of when to use it is for cases when you have a
variable set of attributes for each row. It seems that storing your
standards as XML would be worth exploring.
HTH
--
Ryan Powers
Clarity Consulting
http://www.claritycon.com
" ccjjharmonDOTREMOVE@.dotRemovedotgmaildot
" wrote:
> We are wanting to model standards for the equipment. But for each
> equipment, there may be a completely different set of characteristics
> that make up its standard. There's more to it than just this, but
> let's keep it simple here. What would be clear and obvious to anyone
> who's done even a basic Access database would be to have all of these
> characteristics (and I mean all of them across all the different
> equipment) as columns in one huge standards table. But wait - we
> could have hundreds of characteristics, and most importantly, the users
> will (really, just accept this fact PLEASE) want to change or at least
> add more characteristics over time. This option is a nightmare and we
> (me and the project manager overseeing this) agree it as such. So
> what's flexible?
> I came up with one and know it will work and is normalized and very
> relational, BUT it's a slight bear getting the data out and once
> it's out it cannot be easily joined with anything (you will see why
> in a bit).
> A three table structure - Standards, StandardCharacteristics and
> StandardValues. Visual picture available here -->
> http://www.flickr.com/photos/ccjjharmon/85629981/
> "Standards" holds the standards records that change over time, but
> these records don't hold ANY of the values - they are just the
> placeholder.
> "StandardCharacteristics" holds the names and units for the
> characteristics and can be associated to multiple pieces of equipment
> (separate table to encapsulate this - not shown though).
> "StandardValues" holds the meat (and by far the most records) by
> containing the characteristic and value pair per record. A set of these
> records makes up the standards when joined with the Standards record
> associated.
> So this does work for data storage - but getting the data out is a
> little challenging. A crosstab(based on this article:
> http://www.windowsitpro.com/SQLServ...5608/15608.html)
> was used to flip StandardValues rows into columns in addition to the
> Standards columns.
> OK - so that's my design. What other options are out there? Any
> thoughts on my design? Is it good? Bad?
>|||It's a well know approach. I've seen it being implemened many times. To
my best knowledge, for reasonable amounts of data it's always a total
and miserable failure - the performance is horrible. You can get a
great explanation why in one of Tom Kyte's books.
I would set up a regular table with regular columns for the attributes
all the models have.
For the flexible part, I would first of all consider storing them as
XML.|||* Thanks for the responses.
I will look into EAV --> any particular references you could point out
to refer to?
What is the difference between a row and a record? I would imagine for
my scenario that a complete "Standard" row consists of one Standards
record along with multiple StandardValues records (one for every
recorded StandardCharacteristic). Is there more?
It *could* be in SQL Server 2005 possibly. We did discuss using XML, as
we attended the SQL Server 2005 Launch locally and even talked about
this with a MS expert. We didn't like it however because we do
reference the StandardCharacteristics with other data tables.
What I haven't mentioned (because I hadn't thought of it, but my
product manager mentioned a few minutes ago) was that reporting on this
data is the most important piece to the database model design.
The standards are only one piece of the reporting puzzle - we also are
recording the actual values occurring on the floor, and we need to
report on the differences between these values and the associated
standards. Wouldn't the XML data storage idea be bad for this?
Lastly, using one big regular table with all the attributes as columns
I said was NOT what we wanted to do and is explained in my initial
post.
I will look into Tom Kyte and his books.
So it appears I came up with this EAV design, but it looks like *no
one* recommends it. When if at all should this design be used?
And if never, what is better other than one big table?|||ccjjharmonDOTREMOVE@.dotRemovedotgmaildot
com wrote:
> * Thanks for the responses.
> I will look into EAV --> any particular references you could point out
> to refer to?
You may find the following articles useful:
Generic Design of Web-Based Clinical Databases
http://www.jmir.org/2003/4/e27/
The EAV/CR Model of Data Representation
http://ycmi.med.yale.edu/nadkarni/eav_CR_contents.htm
An Introduction to Entity-Attribute-Value Design for Generic Clinical Study
Data Management Systems
http://ycmi.med.yale.edu/nadkarni/I...V%20systems.htm
Data Extraction and Ad Hoc Query of an Entity Attribute Value Database
[url]http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=9824799[/
url]
Exploring Performance Issues for a Clinical Database Organized Using an Enti
ty-Attribute-Value Representation
[url]http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pubmed&pubmedid=10984467[
/url]|||Excellent links. I believe you (and the medical industry in general)
are the only one so far who says EAV has a place in real-world database
modeling. Does anyone (or industry) else use this model?|||ccjjharmonDOTREMOVE@.dotRemovedotgmaildot
com wrote:
> Excellent links. I believe you (and the medical industry in general)
> are the only one so far who says EAV has a place in real-world database
> modeling. Does anyone (or industry) else use this model?
I asked this question about a year ago in this group. If I remember
correctly, everybody here who participated in the discussion was
against the approach, but still didn't give a better solution for
this particular requirement: unknown number of attributes at design
time. Looks like SQL Server (or any relational database) is not
the best instrument to handle this kind of data, but still can be
used to store the data if you design your model carefully.|||On 17 Jan 2006 11:20:48 -0800,
ccjjharmonDOTREMOVE@.dotRemovedotgmaildot
com wrote:
(snip)
>What is the difference between a row and a record?
Hi ccjjharmon,
Better ignore that part of CELKO's message. He is stuck in the 70s when
people used the terms "field" and "record" in sequential files and
non-relational databases, while relational databases used the words
"column" and "row" for the corresponding parts of the relational model.
Joe (Celko, that is) seems to have missed that lots have changed over
the past 30 years, and that many books on relational databases,
including some good books, use these terms interchangeably.
Think of it as an old man shouting "it's called an automobile, young
lad, not a car!"
>What I haven't mentioned (because I hadn't thought of it, but my
>product manager mentioned a few minutes ago) was that reporting on this
>data is the most important piece to the database model design.
All the more reason to not touch EAV with a 10-foot stick. Include 10
columns in your report, and you have a ten-fold self-join. Not fun.
>The standards are only one piece of the reporting puzzle - we also are
>recording the actual values occurring on the floor, and we need to
>report on the differences between these values and the associated
>standards. Wouldn't the XML data storage idea be bad for this?
I haven't had a chance to play with SQL Server 2005's XML options yet.
My guess, though, is that the overhead of XML parsing and data type
conversion will hurt your performance.
>So it appears I came up with this EAV design, but it looks like *no
>one* recommends it. When if at all should this design be used?
I'd only consider it for attributes that are typed in by the user,
stored in the DB (without ANY checking!!) and displayed on screen when
the user displays the same row again. And even in those cases, I'd have
a strong desire to use a better design.
>Lastly, using one big regular table with all the attributes as columns
>I said was NOT what we wanted to do and is explained in my initial
>post.
(...)
>And if never, what is better other than one big table?
Two slightly smaller tables? :-))
Serious though. I agree with you that one big table is probably a bad
design. Instead, you should sit down with a subject matter expert, find
out the functional dependencies between the attributes, then normalize
to at least third normal form. If, after that, you still have one big
table, yoou can safely conclude that one big table IS the best option.
I know it's not the answer you want to hear, but I don't think there are
any better answers.
Hugo Kornelis, SQL Server MVP|||Sericinus - you said you were looking at using this EAV model a year
ago - what did you end up using for a model?
Hugo - thanks for the detailed response. We had worked around the
self-joining through a crosstab stored procedure - and found the
performance to appear to be OK. With all of these comments however, I
honestly am seriously questioning using this model even after my
testing found it should be able to work. If I didn't point it out
before though, the only to get to the data in what would appear to be a
complete Standards table (with the values and characteristics
cross-tabbed) is via this crosstab stored procedure - and then we can't
reference in a SELECT statement (e.g. SELECT * FROM uspStandards), can
we? No. This is a hindrance.
I still want to push back further before making the decision... and
towards that - what I don't specifically understand is the drawbacks
that are listed for EAV and how they might apply to our usage of this
model. We are asking experts in the area for their feedback on this
specific need and usage of this model. At that point, we would probably
try to "find out the functional dependencies between the attributes,
then normalize to at least third normal form" -- whatever that means.
Hugo - can you put this into more understandable language? I know SQL
fairly well, but am not up on that side of the theory and syntax (Nth
normal form).
We were discussing these comments and what seems possible as another
data model is to use a single large [flattened] table with a set of
characteristic columns (A, B, C, D, etc.) and have within some table a
"decoder" that maps these columns to what they actually are. Does
anyone know well the XML capabilities in SQL 05?
Chris Harmon
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment