- Path: senator-bedfellow.mit.edu!dreaderd!not-for-mail
- Message-ID: <databases/sybase-faq/part12_1082468590@rtfm.mit.edu>
- Supersedes: <databases/sybase-faq/part12_1074677126@rtfm.mit.edu>
- Expires: 2 Aug 2004 13:43:10 GMT
- References: <databases/sybase-faq/part1_1082468590@rtfm.mit.edu>
- X-Last-Updated: 2003/03/02
- From: dowen@midsomer.org (David Owen)
- Newsgroups: comp.databases.sybase,comp.answers,news.answers
- Subject: Sybase FAQ: 12/19 - ASE SQL (1 of 3)
- Reply-To: dowen@midsomer.org (David Owen)
- Followup-To: comp.databases.sybase
- Distribution: world
- Organization: Midsomer Consultants Inc.
- Approved: news-answers-request@MIT.EDU
- Keywords: FAQ, DATABASES, SYBASE, ASA, ASE, REP
- Originator: faqserv@penguin-lust.MIT.EDU
- Date: 20 Apr 2004 13:45:11 GMT
- Lines: 666
- NNTP-Posting-Host: penguin-lust.mit.edu
- X-Trace: 1082468711 senator-bedfellow.mit.edu 574 18.181.0.29
- Xref: senator-bedfellow.mit.edu comp.databases.sybase:106210 comp.answers:56956 news.answers:270296
-
- Archive-name: databases/sybase-faq/part12
- URL: http://www.isug.com/Sybase_FAQ
- Version: 1.7
- Maintainer: David Owen
- Last-modified: 2003/03/02
- Posting-Frequency: posted every 3rd month
- A how-to-find-the-FAQ article is posted in the intervening months.
-
- SQL Fundamentals
-
-
-
- 6.1.1 Are there alternatives to row at a time processing?
- 6.1.2 When should I execute an sp_recompile?
- 6.1.3 What are the different types of locks and what do they mean?
- 6.1.4 What's the purpose of using holdlock?
- 6.1.5 What's the difference between an update in place versus a deferred
- update? - see Q1.5.9
- 6.1.6 How do I find the oldest open transaction?
- 6.1.7 How do I check if log truncation is blocked?
- 6.1.8 The timestamp datatype
- 6.1.9 Stored Procedure Recompilation and Reresolution
- 6.1.10 How do I manipulate binary columns?
- 6.1.11 How do I remove duplicate rows from a table?
-
-
-
- SQL Advanced bcp ASE FAQ
-
- -------------------------------------------------------------------------------
-
- 6.1.1: Alternative to row at a time processing
-
- -------------------------------------------------------------------------------
-
- Someone asked how they could speed up their processing. They were batch
- updating/inserting gobs of information. Their algorithm was as follows:
-
- ... In another case I do:
-
- If exists (select record) then
- update record
- else
- insert record
-
- I'm not sure which way is faster or if it makes a difference. I am doing
- this for as many as 4000 records at a time (calling a stored procedure 4000
- times!). I am interested in knowing any way to improve this. The parameter
- translation alone on the procedure calls takes 40 seconds for 4000 records.
- I am using exec in DB-Lib.
-
- Would RPC or CT-Lib be better/faster?
-
- A netter responded stating that it was faster to ditch their algorithm and to
- apply a set based strategy:
-
- The way to improve your approach is to convert the row-at-a-time processing
- (which is the more traditional way of thinking) into a batch at a time
- (which is the more relational way of thinking). Now I'm not trying to
- insult you or say that you suck or anything like that, we just need to dial
- you in to thinking in relational terms.
-
- The idea is to do batches (or bundles) of rows rather than processing a
- single one at a time.
-
- So let's take your example (since you didn't give exact values [probably
- out of kindness to save my eyeballs] I'll use your generic example to
- extend what I'm talking about):
-
- Before:
-
- if exists (select record) then
- update record
- else
- insert record
-
- New way:
- 1. Load all your rows into a table named new_stuff in a separate work
-    database (call it work_db) and load it using bcp -- no 3GL needed:
-    1. truncate new_stuff and drop all its indexes
-    2. sort your data using the UNIX sort utility, ordering by the
-       clustered index columns
-    3. load it using bcp
-    4. create the clustered index using with sorted_data, plus any
-       ancillary non-clustered indexes
- 2. Assuming that your target table is called old_stuff
- 3. Do the update in a single batch:
- begin tran
-
- /* delete any rows in old_stuff which would normally
- ** would have been updated... we'll insert 'em instead!
- ** Essentially, treat the update as a delete/insert.
- */
-
- delete old_stuff
- from old_stuff,
- new_stuff
- where old_stuff.key = new_stuff.key
-
- /* insert entire new table: this adds any rows
- ** that would have been updated before and
- ** inserts the new rows
- */
- insert old_stuff
- select * from new_stuff
-
- commit tran
-
-
- You can do all this without writing 3-GL, using bcp and a shell script.
-
- A word of caution:
-
- Since these inserts/updates are batch oriented you may blow your
- log if you attempt to do too many at a time. To avoid this, use
- the set rowcount directive to break the work into bite-size chunks.
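-
- The chunked delete might be sketched as follows (the batch size of 1000 is
- illustrative -- pick one your log can handle; old_stuff/new_stuff are the
- tables from the example above):
-
-     set rowcount 1000
-     declare @rows int
-     select @rows = 1000
-     while @rows > 0
-     begin
-         begin tran
-         /* delete at most 1000 matching rows per transaction */
-         delete old_stuff
-         from old_stuff, new_stuff
-         where old_stuff.key = new_stuff.key
-         select @rows = @@rowcount
-         commit tran
-     end
-     set rowcount 0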
-
- -------------------------------------------------------------------------------
-
- 6.1.2: When should I execute an sp_recompile?
-
- -------------------------------------------------------------------------------
-
- An sp_recompile should be issued any time a new index is added or update
- statistics is run. Dropping an index will cause an automatic recompile of
- all objects that are dependent on the table.
-
- The sp_recompile command simply increments the schemacnt counter for the
- given table. All dependent objects' counters are checked against this
- counter and, if they differ, the SQL Server recompiles the object.
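-
- A typical sequence, using a hypothetical table titles:
-
-     update statistics titles
-     go
-     exec sp_recompile titles
-     go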
-
- -------------------------------------------------------------------------------
-
- 6.1.3: What are the different types of (All Page) locks?
-
- -------------------------------------------------------------------------------
-
- First off, just to get it out of the way, Sybase does now support row level
- locking! (See Q6.1.11 for a description of the new features.) OK, that said and
- done, if you think you need row level locking, you probably aren't thinking set
- based -- see Q6.1.1 for set processing.
-
- The SQL Server uses locking to ensure the sanity of your queries. Without
- locking there is no way to ensure the integrity of your operation. Imagine
- a transaction that debits one account and credits another. If the
- transaction didn't lock out readers/writers, someone could potentially see
- erroneous data.
-
- Essentially, the SQL Server attempts to use the least intrusive lock possible,
- page lock, to satisfy a request. If it reaches around 200 page locks, then it
- escalates the lock to a table lock and releases all page locks thus performing
- the task more efficiently.
-
- There are three types of locks:
-
- * page locks
- * table locks
- * demand locks
-
- Page Locks
-
- There are three types of page locks:
-
- * shared
- * exclusive
- * update
-
- shared
-
- These locks are requested and used by readers of information. More than one
- connection can hold a shared lock on a data page.
-
- This allows for multiple readers.
-
- exclusive
-
- The SQL Server uses exclusive locks when data is to be modified. Only one
- connection may have an exclusive lock on a given data page. If a table is large
- enough and the data is spread sufficiently, more than one connection may update
- different data pages of a given table simultaneously.
-
- update
-
- An update lock is placed during a delete or an update while the SQL Server
- is hunting for the pages to be altered. While an update lock is in place,
- shared locks are still allowed, thus permitting higher throughput.
-
- The update lock(s) are promoted to exclusive locks once the SQL Server is ready
- to perform the delete/update.
-
- Table Locks
-
- There are three types of table locks:
-
- * intent
- * shared
- * exclusive
-
- intent
-
- Intent locks indicate the intention to acquire a shared or exclusive lock
- on data pages of the table. Intent locks are used to prevent other
- transactions from acquiring a conflicting table-level lock while those page
- locks are held.
-
- shared
-
- This is similar to a page level shared lock but it affects the entire table.
- This lock is typically applied during the creation of a non-clustered index.
-
- exclusive
-
- This is similar to a page level exclusive lock but it affects the entire table.
- If an update or delete affects the entire table, an exclusive table lock is
- generated. Also, during the creation of a clustered index an exclusive lock is
- generated.
-
- Demand Locks
-
- A demand lock prevents further shared locks from being set. The SQL Server
- sets a demand lock to indicate that a transaction is next in line to lock a
- table or a page.
-
- This avoids indefinite postponement if there was a flurry of readers when a
- writer wished to make a change.
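-
- To see the locks currently held, and which processes hold them, you can use
- the standard system procedures:
-
-     exec sp_lock
-     go
-     exec sp_who
-     go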
-
- -------------------------------------------------------------------------------
-
- 6.1.4: What's the purpose of using holdlock?
-
- -------------------------------------------------------------------------------
-
- All select/readtext statements acquire shared locks (see Q6.1.3) to retrieve
- their information. After the information is retrieved, the shared lock(s) is/
- are released.
-
- The holdlock option is used within transactions so that after the select/
- readtext statement the locks are held until the end of the transaction:
-
- * commit transaction
- * rollback transaction
-
- If the holdlock is not used within a transaction, the shared locks are
- released.
-
- Example
-
- Assume we have the following two transactions and that each where-clause
- qualifies a single row:
-
- tx #1
-
- begin transaction
- /* acquire a shared lock and hold it until we commit */
- 1: select col_1 from table_a holdlock where id=1
- 2: update table_b set col_3 = 'fiz' where id=12
- commit transaction
-
- tx #2
-
- begin transaction
- 1: update table_a set col_2 = 'a' where id=1
- 2: update table_c set col_3 = 'teo' where id=45
- commit transaction
-
- If tx#1, line 1 executes prior to tx#2, line 1, tx#2 waits to acquire its
- exclusive lock until tx#1 releases the shared level lock on the object. This
- will not be done until the commit transaction, thus slowing user throughput.
-
- On the other hand, if tx#1 had not used the holdlock attribute, tx#2 would not
- have had to wait until tx#1 committed its transaction. This is because shared
- level locks are released immediately (even within transactions) when the
- holdlock attribute is not used.
-
- Note that the holdlock attribute does not stop another transaction from
- acquiring a shared level lock on the object (i.e. another reader). It only
- stops an exclusive level lock (i.e. a writer) from being acquired.
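-
- Note that in later versions of the server the same hold-until-commit
- behaviour can also be requested connection-wide, rather than per statement,
- which is roughly equivalent to applying holdlock to every select:
-
-     set transaction isolation level 3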
-
- -------------------------------------------------------------------------------
-
- 6.1.6: How do I find the oldest open transaction?
-
- -------------------------------------------------------------------------------
-
- The following query uses master..syslogshold (available from System 11) to
- show the oldest open transaction for each spid:
-
- select h.spid, u.name, p.cmd, h.name, h.starttime,
- p.hostname, p.hostprocess, p.program_name
- from master..syslogshold h,
- master..sysprocesses p,
- master..sysusers u
- where h.spid = p.spid
- and p.suid = u.suid
- and h.spid != 0 /* not replication truncation point */
-
- -------------------------------------------------------------------------------
-
- 6.1.7: How do I check if log truncation is blocked?
-
- -------------------------------------------------------------------------------
-
- System 11 and beyond:
-
- select h.spid, convert(varchar(20), h.name), h.starttime
- from master..syslogshold h,
- sysindexes i
- where h.dbid = db_id()
- and h.spid != 0
- and i.id = 8 /* syslogs */
- and h.page in (i.first, i.first+1) /* first page of log = page of oldest xact */
-
- -------------------------------------------------------------------------------
-
- 6.1.8: The timestamp datatype
-
- -------------------------------------------------------------------------------
-
- The timestamp datatype is a user-defined datatype supplied by Sybase, defined as:
-
- varbinary(8) NULL
-
- It has a special use when used to define a table column. A table may have at
- most one column of type timestamp, and whenever a row containing a timestamp
- column is inserted or updated the value in the timestamp column is
- automatically updated. This much is covered in the documentation.
-
- What isn't covered is what the values placed in timestamp columns actually
- represent. It is a common misconception that timestamp values bear some
- relation to calendar date and/or clock time. They don't - the datatype is
- badly-named. SQL Server keeps a counter that is incremented for every write
- operation - you can see its current value via the global variable @@DBTS
- (though don't try to use this value to predict what will get inserted into a
- timestamp column, as every connection shares the same counter).
-
- The value is maintained between server startups and increases monotonically
- over time (though again you cannot rely on this behaviour). Eventually the
- value will wrap, potentially causing huge problems, though you will be warned
- before it does - see Sybase Technical News Volume 5, Number 1 (see Q10.3.1).
- You cannot convert this value to a datetime value - it is simply an 8-byte
- integer.
-
- Note that the global timestamp value is used for recovery purposes in the
- event of an RDBMS crash. As transactions are committed to the log, each
- transaction gets a unique timestamp value. The checkpoint process places a
- marker in the log with its unique timestamp value. If the RDBMS crashes,
- recovery is the process of looking for transactions that need to be rolled
- forward and/or backward from the checkpoint event. If a transaction spans
- the checkpoint event and never completed, it too needs to be rolled
- back.
-
- Essentially, this describes the write-ahead log protocol described by C.J.
- Date in An Introduction to Database Systems.
-
- So what is it for? It was created in order to support the browse-mode functions
- of DB-Library (and for recovery as mentioned above). This enables an
- application to easily support optimistic locking (See Q1.5.4) by guaranteeing a
- watch column in a row will change value if any other column in that row is
- updated. The browse functions checked that the timestamp value was still the
- same as when the column was read before attempting an update. This behaviour is
- easy to replicate without necessarily using the actual client browse-mode
- functions - just read the timestamp value along with other data retrieved to
- the client, and compare the stored value with the current value prior to an
- update.
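-
- The check-before-update pattern described above can be sketched in
- Transact-SQL (table, column and variable names are illustrative, and the
- variables are assumed to have been declared and set):
-
-     /* read the row, remembering its timestamp */
-     select @ts = timestamp, @qty = qty
-     from stock
-     where id = @id
-
-     /* later: update only if nobody else has touched the row */
-     update stock
-     set qty = @new_qty
-     where id = @id
-     and timestamp = @ts
-
-     if @@rowcount = 0
-         print "row changed by another user - re-read and retry"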
-
- -------------------------------------------------------------------------------
-
- 6.1.9: Stored Procedure Recompilation and Reresolution
-
- -------------------------------------------------------------------------------
-
- When a stored procedure is created, the text is placed in syscomments and a
- parse tree is placed in sysprocedures. At this stage there is no compiled query
- plan.
-
- A compiled query plan for the procedure only ever exists in memory (that is, in
- the procedure cache) and is created under the following conditions:
-
- 1. A procedure is executed for the first time.
- 2. A procedure is executed by a second or subsequent user when the first plan
- in cache is still in use.
- 3. The procedure cache is flushed by server restart or cache LRU flush
- procedure.
- 4. The procedure is executed or created using the with recompile option.
-
- If the objects the procedure refers to change in some way - indexes dropped,
- table definition changed, etc - the procedure will be reresolved - which
- updates sysprocedures with a modified tree. Before 10.x the tree grows and in
- extreme cases the procedure can become too big to execute. This problem
- disappears in Sybase System 11. This reresolution will always occur if the
- stored procedure uses temporary tables (tables that start with "#").
-
- There is apparently no way of telling if a procedure has been reresolved.
-
- Traceflag 299 offers some relief, see Q1.3.3 for more information regarding
- traceflags.
-
- The Official Explanation -- Reresolution and Recompilation Explained
-
- When stored procedures are created, an entry is made in sysprocedures that
- contains the query tree for that procedure. This query tree is the resolution
- of the procedure and the applicable objects referenced by it. The syscomments
- table will contain the actual procedure text. No query plan is kept on disk.
- Upon first execution, the query tree is used to create (compile) a query plan
- (execution plan) which is stored in the procedure cache, a server memory
- structure. Additional query plans will be created in cache upon subsequent
- executions of the procedure whenever all existing cached plans are in use. If a
- cached plan is available, it will be used.
-
- Recompilation is the process of using the existing query tree from
- sysprocedures to create (compile) a new plan in cache. Recompilation can be
- triggered by any one of the following:
-
- * First execution of a stored procedure,
- * Subsequent executions of the procedure when all existing cached query plans
- are in use,
- * If the procedure is created with the recompile option, CREATE PROCEDURE
- sproc WITH RECOMPILE
- * If execution is performed with the recompile option, EXECUTE sproc WITH
- RECOMPILE
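-
- Both recompile options look like this (procedure and table names are
- illustrative):
-
-     create procedure get_orders @cust int
-     with recompile
-     as
-         select * from orders where cust_id = @cust
-     go
-
-     exec get_orders 42 with recompile
-     go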
-
- Re-resolution is the process of updating the query tree in sysprocedures AND
- recompiling the query plan in cache. Re-resolution only updates the query tree
- by adding the new tree onto the existing sysprocedures entry. This process
- causes the procedure to grow in size which will eventually cause an execution
- error (Msg 703 - Memory request failed because more than 64 pages are required
- to run the query in its present form. The query should be broken up into
- shorter queries if possible). Execution of a procedure that has been flagged
- for re-resolution will cause the re-resolution to occur. To reduce the size of
- a procedure, it must be dropped which will remove the entries from
- sysprocedures and syscomments. Then recreate the procedure.
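-
- A hypothetical drop-and-recreate sequence (retrieve the current text first,
- for instance with sp_helptext; the procedure body shown is a placeholder):
-
-     exec sp_helptext my_proc
-     go
-     drop procedure my_proc
-     go
-     /* recreate from the saved text */
-     create procedure my_proc
-     as
-         select title_id from titles
-     go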
-
- Re-resolution can be triggered by various activities most of which are
- controlled by SQL Server, not the procedure owner. One option is available for
- the procedure owner to force re-resolution. The system procedure, sp_recompile,
- updates the schema count in sysobjects for the table referenced. A DBA usually
- will execute this procedure after creating new distribution pages by use of
- update statistics. The next execution of procedures that reference the table
- flagged by sp_recompile will have a new query tree and query plan created.
- Automatic re-resolution is done by SQL Server in the following scenarios:
-
- * Following a LOAD DATABASE on the database containing the procedure,
- * After a table used by the procedure is dropped and recreated,
- * Following a LOAD DATABASE of a database where a referenced table resides,
- * After a database containing a referenced table is dropped and recreated,
- * Whenever a rule or default is bound or unbound to a referenced table.
-
- Forcing automatic compression of procedures in System 10 is done with trace
- flag 241. System 11 should be doing automatic compression, though this is not
- certain.
-
- When are stored procedures compiled?
-
- Stored procedures are in a database as rows in sysprocedures, in the form of
- parse trees. They are later compiled into execution plans.
-
- A stored procedure is compiled:
-
- 1. with the first EXECute, when the parse tree is read into cache
- 2. with every EXECute, if CREATE PROCEDURE included WITH RECOMPILE
- 3. with each EXECute specifying WITH RECOMPILE
- 4. if the plans in cache for the procedure are all in use by other processes
- 5. after a LOAD DATABASE, when all procedures in the database are recompiled
- 6. if a table referenced by the procedure can not be opened (using object id),
- when recompilation is done using the table's name
- 7. after a schema change in any referenced table, including:
- 1. CREATE INDEX or DROP INDEX to add/delete an index
- 2. ALTER TABLE to add a new column
- 3. sp_bindefault or sp_unbindefault to add/delete a default
- 4. sp_bindrule or sp_unbindrule to add/delete a rule
- 8. after EXECute sp_recompile on a referenced table, which increments
- sysobjects.schema and thus forces re-compilation
-
- What causes re-resolution of a stored procedure?
-
- When a stored procedure references an object that is modified after the
- creation of the stored procedure, the stored procedure must be re-resolved.
- Re-resolution is the process of verifying the location of referenced objects,
- including the object id number. Re-resolution will occur under the following
- circumstances:
-
- 1. One of the tables used by the stored procedure is dropped and re-created.
- 2. A rule or default is bound to one of the tables (or unbound).
- 3. The user runs sp_recompile on one of the tables.
- 4. The database the stored procedure belongs to is re-loaded.
- 5. The database that one of the stored procedure's tables is located in is
- re-loaded.
- 6. The database that one of the stored procedure's tables is located in is
- dropped and re-created.
-
- What will cause the size of a stored procedure to grow?
-
- Any of the following will cause a stored procedure to grow when it is
- recompiled:
-
- 1. One of the tables used in the procedure is dropped and re-created.
- 2. A new rule or default is bound to one of the tables or the user runs
- sp_recompile on one of the tables.
- 3. The database containing the stored procedure is re-loaded.
-
- Other things causing a stored procedure to be re-compiled will not cause it to
- grow. For example, dropping an index on one of the tables used in the procedure
- or doing EXEC WITH RECOMPILE.
-
- The difference is between simple recompilation and re-resolution. Re-resolution
- happens when one of the tables changes in such a way that the query trees
- stored in sysprocedures may be invalid. The datatypes, column offsets, object
- ids or other parts of the tree may change. In this case, the server must
- re-allocate some of the query tree nodes. The old nodes are not de-allocated
- (there is no way to do this within a single procedure header), so the procedure
- grows. In time, trying to execute the stored procedure will result in a 703
- error about exceeding the 64 page limit for a query.
-
- -------------------------------------------------------------------------------
-
- 6.1.10: How do I manipulate varbinary columns?
-
- -------------------------------------------------------------------------------
-
- The question was posed - How do we manipulate varbinary columns, given that
- some portion - like the 5th and 6th bit of the 3rd byte - of a (var)binary
- column, needs to be updated? Here is one approach, provided by Bret Halford (
- bret@sybase.com), using stored procedures to set or clear certain bits of a
- certain byte of a field of a row with a given id:
-
- drop table demo_table
- drop procedure clear_bits
- drop procedure set_bits
- go
- create table demo_table (id numeric(18,0) identity, binary_col
- binary(20))
- go
- insert demo_table values (0xffffffffffffffffffffffffffffffffffffffff)
- insert demo_table values (0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa)
- insert demo_table values (0x0000000000000000000000000000000000000000)
- go
-
- create procedure clear_bits (
- @id numeric(18,0), -- primary key of row to be changed
- @bytenum tinyint, -- specifies which byte of binary_col to change
- @mask binary(1) -- bits to be cleared are zeroed,
- -- bits left alone are turned on
- -- so 0x00 = clear all, 0xfb = clear bit 3
- )
- as
- update demo_table set binary_col =
- substring(binary_col,1,@bytenum-1)+
- convert(binary(1),
- convert(tinyint,substring(binary_col,@bytenum,1)) &
- convert(tinyint,@mask)
- )+
- substring(binary_col,@bytenum+1,20)
- from demo_table
- where id = @id
- go
-
- create procedure set_bits (
- @id numeric(18,0), -- primary key of row to be changed
- @bytenum tinyint, -- specifies which byte of binary_col to change
- @mask binary(1) -- bits to be set are turned on
- -- bits left alone are zeroed
- -- so 0xff = set all, 0xfb = set all but 3
- )
- as
- update demo_table set binary_col =
- substring(binary_col,1,@bytenum-1)+
- convert(binary(1),
- convert(tinyint,substring(binary_col,@bytenum, 1)) |
- convert(tinyint,@mask)
- )+
- substring(binary_col,@bytenum+1,20)
- from demo_table
- where id = @id
- go
-
- select * from demo_table
- -- clear bits 2,4,6,8 of byte 1 of row 1
- exec clear_bits 1,1,0x55
-
- -- set bits 1-8 of byte 20 of row 3
- exec set_bits 3,20,0xff
-
- -- clear bits 1-8 of byte 4 of row 2
- exec clear_bits 2,4,0x00
-
- -- clear bit 3 of byte 5 of row 2
- exec clear_bits 2,5,0xfb
- exec clear_bits 2,6,0x0f
- exec set_bits 2,10,0xff
- go
-
- select * from demo_table
- go
-
-
- -------------------------------------------------------------------------------
-
- 6.1.11: How do I remove duplicate rows from a table?
-
- -------------------------------------------------------------------------------
-
- There are a number of different ways to achieve this, depending on what you are
- trying to achieve. Usually, you are trying to remove duplication of a certain
- key due to changes in business rules or recognition of a business rule that was
- not applied when the database was originally built.
-
- Probably the quickest method is to build a copy of the original table:
-
- select *
- into temp_table
- from base_table
- where 1=0
-
- Create a unique index, with the ignore_dup_key attribute, on the columns
- that cover the duplicating rows. This may be more columns than the key for
- the table.
-
- create unique index temp_idx
- on temp_table(col1, col2, ..., colN)
- with ignore_dup_key
-
- Now, insert base_table into temp_table.
-
- insert temp_table
- select * from base_table
-
- You probably want to ensure you have a very good backup of base_table at
- this point, because you're going to clear it out! You will also want to
- check that temp_table includes the rows you need. You also need to ensure
- that there are no triggers on the base table (remember to keep a copy!) or
- RI constraints. You probably do not want any of these to fire, or if they
- do, you should be aware of the implications.
-
- Now you have a couple of choices. You can simply drop the original table and
- rename the temp table to the same name as the base table. Alternatively,
- truncate the base table and insert into it from temp_table. You would need
- to take the second route if you do need the RI to fire on the table etc. I
- suspect that in most cases dropping and renaming will be the best option.
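-
- The drop-and-rename route might look like this (using the same table names
- as above):
-
-     drop table base_table
-     go
-     exec sp_rename temp_table, base_table
-     go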
-
- If you want to simply see the duplicates in a table, the following query will
- help:
-
- select key1, key2, ...
- from base_table
- group by key1, key2, key3, key4, ...
- having count(*) > 1
-
- Sybase will actually allow a "select *", but it is not guaranteed to work.
-
- -------------------------------------------------------------------------------
-
-