作者flakchen (flak)
看板Database
标题Re: [SQL ] 一些关於SQL Server的问题
时间Wed Mar 19 10:08:05 2008
※ 引述《jameswiki》之铭言:
: <前面吃光光了..太长了XD>
: Well,蛮多人讨论的..呵,
: 用guid,newid()来产生值做为PK的目的很多,
: 其实这种用Guid来做PK键的讨论很多, 是否合用,见人见智,
: flack大大,您资料表上千万上亿笔的,或许不合用,
: 不过换成小弟,但在初始规划时,我大概不会规划这种一次存上亿笔的,
: 不用比那个小小的字串跟整数当PK的效能,
: 光是left join一次上亿笔就吃不消了
: 可能我经验不足,不过若100万笔资料来做连几次left join,
: 前台的client大概都不要用了,上亿笔?..我不敢想...
的确没办法用Left Join,一定要Inner Join
而且要参考我之前讲的一个关键:Table里面不能有半个字串栏位,只能有整数或Datetime
(Datetime只有八位元)
: (不要跟我说所有栏位都在同一table,你从来不用left join,那又要讨论资料库规划
: -->离题了XD)
没错,面对这麽大的资料量作垂直或水平分割(Partition)是基本手段
不过还是有些场合时常要跨Partition查询时,SQL 2000以前的Partition View
就常常不能将效能最佳化,而且它能Union的资料表还有255个的上限,所以这些
可能就得将资料塞回去同一个资料表
所幸SQL 2005的Partition Table终於比较强一点,而可支援的Partition数目
也比较多,但这也使资料表规划技巧更形重要
: 我引述下面网址的文章
: http://blog.miniasp.com/post/2008/01/08/The-Gospel-of-the-GUID.aspx
: 其实我自己用久了,发现其实好处不止如此了..不一一列举,就以这份文章讨论吧
感谢您提供资料参考,这里也提供一份SQL Magazine的资料
http://www.sqlmag.com/Articles/ArticleID/23449/23449.html?Ad=1
Surrogate Key vs. Natural Key
不过它是付费文章,只能请各位自己想办法,节录以下几个重点,基本上就是看资料数
如果资料行数多到不可能同时塞进记忆体,那最好让栏位宽度与Index瘦身,来尽可能
塞入更多Index到记忆体中
The kind of database platform you work from can help determine whether you
need to use a natural value or a surrogate identifier for your primary key.
The criterion that a primary key must be minimal means the fewer the columns,
the better. A single-column primary key simplifies data storage, retrieval,
and coding. SQL Server automatically creates a unique index (and, by default,
a clustered index) on any column designated as a primary key. In Figure 1's
Purchase table example, if the primary key were CustID plus PurchaseDate,
SQL Server would create a 12-byte index key. If you created a surrogate key
with an integer data type for the Purchase table's primary key, the index key
would be one-third the size. Eight bytes here and 8 bytes there don't sound
like much until you start calculating the additional I/O required for reading
and writing hundreds of thousands of records. If the records are short, you
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
can fit more records on a page that's being read into memory or written to
disk. Here's a general rule: The more records that can fit on a page, the
better the performance.
........
When people ask me which makes a better primary key—a natural value or a
surrogate identifier—I answer that the choice depends on the situation. In
all the modeling work that I've done, I've rarely been able to identify a
suitable natural key as a candidate for the job. I've been able to identify
natural search keys, both unique and nonunique. But when I apply the
prerequisite tests—is it unique? does it apply to all rows? is it minimal?
is it stable over time?—to these natural keys, most natural keys fail.
My personal preference is to use a surrogate key unless I can identify an
appropriate natural key that meets the four criteria for the primary key.
--
※ 发信站: 批踢踢实业坊(ptt.cc)
◆ From: 203.70.51.189
1F:推 jameswiki:soga,所以,像您这种资料库有上亿笔,无法用left join 03/19 11:27