作者Laviathan (Garfield)
站内Database
标题[SQL ] 请问一下该如何统计一笔资料null值栏位有多少?
时间Tue Sep 23 15:03:40 2008
各位好,使用的资料库是MS SQL 2005
现在想统计每一笔资料有多少百分比的缺漏栏位,
资料形式如下:
BAN COMPANY_NAME YYYYMM CURRENT_ASSETS
======== ============ ======= ==============
xxxxxxxx 太平洋 2004/12 7292833
xxxxxxxx 太平洋 2005/12 6853064
xxxxxxxx 太平洋 2006/06 null
xxxxxxxx 太平洋 2006/06 4964290
此张表原则上要以为BAN + YYYYMM做为key
但因资料来源不保证不会重复,
所以现在去除重复资料的方式是统计每一笔record拥有最多null field的去除掉
假设像上面是四个栏位的话,第三笔资料的null field比率是25% (=1/4)就会被去掉
但是没找到SQL中有这种横向的统计方式(有没有count_row这种函数啊...),
本来想转成直的,再用count统计,但因为key重复,统计出来後却会不知道要删哪一笔
现在又想到要以暴力法的方式(使用巢状回圈去统计)
但是这样很没效率,是否较快速且比较精简的方式统计栏位呢?
自问自答...但不是解的很好看,其实就是暴力法...
declare
@strColumn varchar(max),
@intTotalColumn int,
@strTableName varchar(50),
@strSelectSQL varchar(max)
select @strColumn = '', @intTotalColumn = 0, @strTableName = 'myTable'
DECLARE cur_syscolumns cursor
for
select
sysobjects.name as Table_name
,syscolumns.id
,syscolumns.name as Field_name
,systypes.name as Type_name
,syscolumns.prec
,syscolumns.length as length
,syscolumns.isnullable as isnullable
from syscolumns
left outer join systypes on syscolumns.xtype = systypes.xtype
left outer join sysobjects on sysobjects.id = syscolumns.id
where
sysobjects.name in (@strTableName) and
systypes.name <> 'sysname'
order by
sysobjects.name, colid
open cur_syscolumns
DECLARE
@Table_name varchar(50),
@id int,
@Field_name varchar(50),
@Type_name varchar(50),
@prec int,
@length int,
@isnullable int
FETCH NEXT FROM cur_syscolumns
into
@Table_name,
@id,
@Field_name,
@Type_name,
@prec,
@length,
@isnullable
while (@@fetch_status = 0)
begin
set @strColumn = @strColumn + 'case when ' + @Field_name +
' is null then 0.0 else 1.0 end + '
set @intTotalColumn = @intTotalColumn + 1
FETCH NEXT FROM cur_syscolumns
into
@Table_name,
@id,
@Field_name,
@Type_name,
@prec,
@length,
@isnullable
end
close cur_syscolumns
deallocate cur_syscolumns
set @strColumn = @strColumn + '0 as Not_Null_Columns'
--print(@strColumn)
set @strSelectSQL = 'select ' + @strColumn + ', ' + convert(varchar(max),
@intTotalColumn) + ' as TotalColumn, * from ' +
@strTableName
--print(@strSelectSQL)
exec('select Not_Null_Columns/TotalColumn as Not_Null_Percent, * from (' +
@strSelectSQL + ') as a')
※ 编辑: Laviathan 来自: 114.45.17.32 (09/23 22:41)
※ 编辑: Laviathan 来自: 114.45.17.32 (09/23 22:47)
1F:→ Laviathan:刚刚发现SQL 2008的SSIS有一项功能就在统计资料null比率 09/29 11:33