Thursday, March 29, 2012

Data Structures

(Wow; surfing this site has really illuminated what a lowly hack-programmer I am to this field of SQL and relational processing :S )

I am creating a temp table, doing an Bill of Material explosion for a single Order Line Item.

(note: SQL Server 2000)

I used blindman's "accumulator method (click here) (http://www.dbforums.com/showpost.php?p=6239109&postcount=6)" to generate the entire potential BOM tree.

So; step 1 works wonderfully! :beer: Cheers blindman!

Now; I want to remove unwanted nodes (and all their children) from the temp-table. I have a bunch of functions (0 to 1 for each node) that return a Yes or No. If "No", then flag-to-eliminate immediately that branch and don't revisit anything on it (and therein lies my problem). This will leave me with a temp table of only valid nodes.

Since this is recursive, and since it will involve Dynamic SQL (the function name varies), all I can come up with is using a Cursor in a WHILE loop. Even at that; since a CURSOR is point-in-time (ie: values don't change once selected), I'll have to re-check the current temp table values (or create of 2nd temp table of only deleted nodes and repeat a SELECT with NOT EXISTS in it, hmmm).

Since blindman's method generates a table ordered by level, the sequence of processing is pre-determined - unless I can re-order it into a more traditional hierarchy (1 entire branch at a time) and number the levels, in which case the cursor could just skip to the next branch of equal or higher level.

Note: The thought does occur to me I could have an intermediary function (static name) that in turn does the Dynamic SQL. These functions contain the Business Logic that looks at a myriad of column values and relationships in the database and there's no one-size-fits-all decision tree so Dynamic SQL is necessary.

The max cursor size will be maybe 300, and on average 100. Number of levels will normally be 3 or 4, but conceivably could be up to 10. Given the average 100 potential components/sub-assemblies, the final assembly will be about 30. As a periodic background process; it will do 3,000 Order Line Items a day, so I'm figuring 1 second response time per build is adequate (ie: the user's not waiting on it so it doesn't have to be blinding fast) - however why waste?

Anyhow; I thought this might be a fun problem for some Data Structure genius who wants to give a lesson in Relational Programming.

Thanks for looking.

Here's what I have so far:

CREATE PROCEDURE dbo.sp_ExplodeTest1 (@.recID int = 1) AS
/* tbTestH is a table containing an assembly hierarchy.
Assemblies with no parent are Builds.
Assemblies with no children are Components.
It's columns:
MyID int,
ParentID int,
(other descriptive columns)
*/

declare @.t table (
TempNodeID int identity(1,1)
,MyID int
)

-- Seed the tree with the Build's ID.
insert into @.t (MyID) values (@.recID)

/* This populates the temp table with the entire
Assembly for the given build.
It is Ordered By the level in the assembly.
Number of assembly levels is infinite.

For example:
Level 1 = the Build. It comes first
Level 2 = all parents are level 1. Is next (no particular seq)
Level 3 = all parents are level 2. Is next (no particular seq)
etc.

*/
while @.@.Rowcount > 0
insert into @.t (MyID)
select tbTestH.MyID
from tbTestH
inner join @.t rList on tbTestH.ParentID = rList.MyID
where not exists (
select *
from @.t CurrentList
where CurrentList.MyID = tbTestH.MyID)

-- now to display the results so far.
select t.*, tbTestH.*
from tbTestH inner join @.t t on tbTestH.MyID = t.MyID
order by t.TempNodeID

GOOk, well you base code looks fine. But I don't understand what you are trying to do next.

Do you want to use one or more functions to eliminate a branch and all of its children from the dataset you have created? Why not just embed the function in the code you use to populate the table, filtering out all and excluding all those that fail your function tests?

And thanks for the beer.|||Ok, well you base code looks fine. But I don't understand what you are trying to do next.

Do you want to use one or more functions to eliminate a branch and all of its children from the dataset you have created? Why not just embed the function in the code you use to populate the table, filtering out all and excluding all those that fail your function tests?

And thanks for the beer.
You're welcome and well deserved.

I'm not sure about the ultimate possibilities, but if I'm forced with the cursor approach then you are right, I could just skip them. Deleting from the list (or flagging for deletion, since that would have less overhead) only has value if there will be multiple passes, particularly using cursors. In other words, I don't want to repeat work.

I'm realizing that if I have 3000 Order Line Items to process, that I may be able to do them all in one swoop (attaching the OrderLineItemID to each row of course).

Each node of the final build will have a quantity and size that also require calculation. I've defined 15 different methods for calculating quantity and size (example: same as height, same as width, xref-lookup based on height, xref-lookup based on width, xref-lookup based on height/width combo, hard coded at 1, etc.). As well; each component translates (either directly or via a color-xref lookup) into an actual Product ID.

In cases where the node is not a component (ie: has children), it's quantity is used as a multiplier for it's branch. Example; if there's 4 "ladder assemblies", and each ladder has 2 plastic plugs, then the "plastic plug" quantity will be 8. Therefore; the heirchy probably needs to be traversed in sequence - unless I can think of a fancy single update with group-by statement that zooms it all together.

I'm afraid my brain is incapable of looking at the problem and knowing just how to best solve it. I plan to use this remainder of this week to work that out. If someone here is inclined to help out, I'll be most grateful (hey; I'll hand out a whole page of beers for an elegant solution).

Not sure if I've given enough info for the next step. I guess that would be either
1. Ordering this into hierarchical sequence and assigning level numbers
2. Ordering this into hierarchical sequence and assigning high-low node numbers.
Hopefully; with a nice single-SQL update statement (or series of statements like Blindman's WHILE ... INSERT).

My goal isn't the most elegant solution in the world, but one that'll work pretty well for the problem at hand (BOM, MRP, Forcasting system for 35m/yr custom build-to-order manufacturer). By "well", I mean modestly fast, scalable, maintainable, modular, flexible, and accurate.

Thanks!|||So you want to run different function tests depending upon whether the node has children or not?
I still do not see why you think you need a cursor for this.|||So you want to run different function tests depending upon whether the node has children or not?
I still do not see why you think you need a cursor for this.

I believe it's not 1 level of children.....|||So you want to run different function tests depending upon whether the node has children or not?
I still do not see why you think you need a cursor for this.
Each node, regardless of it's status (Tree, Branch, or Leaf) may be either:
1. Always present, given that it's parent is present.
or
2. Optionally present, dependent upon the Y/N result of a custom function.

The fuction for each node is custom only to that node.

Example:

CREATE FUNCTION fn_YN_10 (parms...)
RETURNS bit AS
BEGIN
DECLARE @.MyAnswer bit
SET @.MyAnswer = 1
IF (SELECT MySpecialFeature from tbLineItem where LineItemID = @.parmLineItemID) = "S"
SET @.MyAnswer = 0

RETURN @.MyAnswer

END

The name "fn_YN_10" is a value on the tbAssy. How can I execute that except within a Cursor? Please illuminate?

Also; I need to skip the entire branch if the answer is 1 (ie: No). One method to do that is have the Cursor order everything in hierarchical sequence(everything for a branch together) and process it in traditional loop fashion and with the levels numbered (tree = 1, next branch = 2, leafs = some higher number). If the answer is 1 (false) then it skips everything until the next node at the same or a lower level.

In skipping a branch; I could possibly add to a 2nd table of "Excluded Branches" and have a select with "NOT EXISTS". I believe there's a way to do that HOWEVER; if I have to use a cursor to get the Dynamic SQL anyway, then the first approach makes the most sense.

Note: If I do process 3000 Line items simultaneously, with avg 100 nodes each, we're talking 300,000 nodes. Now it's time to address performance. If I can avoid a Dynamic SQL Cursor, then by all means, that's the way to go. Not sure if doing all at once will make things faster or just cause indexing problems.

One possible solution: I could create an intermediate fuction to which the Function Name (and applicable parms) is passed. That would have a static name. It could then do the Dynamic SQL. That would avoid forcing the processing to use a CURSOR just because of the need for Dynamic SQL. But then it's back to how I can skip a branch without doing a Cursor Loop.|||I believe it's not 1 level of children.....
Correct:

Level 1 = tree
Level 2 or higher = branch
Level 2 or higher = leaf

Children always have higher number so by definition cannot be 1.|||but are we talking about children, grand children, great grandchildren, ect|||but are we talking about children, grand children, great grandchildren, ect
Yes.

Example of the tree representation:
Tree = no parent
Branch = has parent and child
Leaf = no child

01 (tree's root - level 1)
..02 (leaf - level 2)
..03 (branch - level 2)
...04 (branch - level 3)
.....05 (leaf - level 4)
......06 (leaf - level 5)
..07 (branch - level 2)
...08 (branch - level 3)
.....09 (leaf - level 4)
.....09 (leaf - level 4)
..10 (leaf - level 2)
..11 (branch - level 2)
...12 (leaf - level 3)

The child, father, grandfather, great grandfather traverses in the other direction and as such, has limitations
01 (great grandfather)
..02 (child) oops, this doesn't really work ...
..03 (grandfather)
...04 (father)
.....05 (child) oops, this doesn't work either
......06 (child) grandchild? doesn't fit.
etc

So; Child, Father, Grandfather notation works when referencing the lineage of a single elementry component, but not for representing a jagged hierarchy from it's highest level on down.

I see where child needs to be level 01 and so on using this notation method, ergo the confusion.

My bad for using the Child/Parent notation. I should have stuck with the Tree notation. Sorry about the confusion.

No comments:

Post a Comment