Problem description
I have a table that contains user names like this:
Name
-----
Smith-Bay, Michael R.
Abbott, David Jr.
Actor, Cody
Agular, Stephen V.
I need the names to look like this:
Last First MI
-------------------------
Smith-Bay Michael R
Abbott David Jr
Actor Cody
Agular Stephen V
I have the following SQL that splits the name into last and first name:
select vl.lastname, vf.firstname
from users as t cross apply
(values (left(t.name, charindex(', ', t.name)), stuff(t.name, 1,
charindex(', ', t.name) + 1, ''))) vl(lastname, rest)
cross apply
(values (left(vl.rest, charindex(' ', vl.rest + ' ')))) vf(firstname)
order by vl.lastname
How can I add another cross apply that extracts everything after the first name, minus the period at the end?
Recommended answer
I've had to do this many times. I work on ETL regularly and often need to extract items from strings, either because the data was stored badly or because I simply have to pull it out of reports. The data isn't always nicely packaged in separate columns, and I find myself parsing data for all sorts of reasons. Hopefully the data you are parsing is consistent; inconsistent data makes this much more difficult or even impossible. If you can rely on your names being in exactly the format you showed, the method below will work well. I've used it on many occasions.
I've used the method below in many different languages and tools: MS Access, Microsoft SSMS (SQL Server) and C#. My example is written for Oracle.
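The basic idea is:
1. Find the positions of the characters that separate the First_Name, Last_Name and Middle_Initial strings.
2. Use those positions to extract the strings into new columns.
The code is as follows: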
WITH character_pos AS
(
    /* First we need the positions of the comma, the 2nd space and the period */
    SELECT name
         /* Position of the ', ' separator, which marks the end of the last name */
         , instr(name, ', ') AS comma_1st_space_pos
         /* Position of the 2nd space, which marks the end of the first name (0 if there is no 2nd space) */
         , instr(name, ' ', 1, 2) AS comma_2nd_space_pos
         /* Length of the first name, so substr knows how many characters to extract */
         , instr(name, ' ', 1, 2) - (instr(name, ', ') + 2) AS first_name_length
         /* Position of the period, so we can extract the middle initial should it exist (0 if absent) */
         , instr(name, '.') AS period_pos
         /* Number of characters between the 2nd space and the period */
         , (instr(name, '.') - 1) - instr(name, ' ', 1, 2) AS middle_initial_length
      FROM parse_name
) /* END character_pos CTE */
SELECT name
     , substr(name, 1, comma_1st_space_pos - 1) AS last_name
     /* No period means no middle initial: the first name runs to the end of the string */
     , CASE WHEN period_pos = 0 THEN substr(name, comma_1st_space_pos + 2)
            ELSE substr(name, comma_1st_space_pos + 2, first_name_length)
       END AS first_name
     /* substr returns NULL when middle_initial_length is less than 1 */
     , substr(name, comma_2nd_space_pos + 1, middle_initial_length) AS middle_initial
     , comma_1st_space_pos, comma_2nd_space_pos, first_name_length
     , period_pos, middle_initial_length
  FROM character_pos
;
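To try the query above, a small test table can be set up as sketched here. The answer never shows how parse_name is defined, so the single VARCHAR2(100) column is an assumption; the rows are the sample names from the question.

```sql
-- Hypothetical setup (Oracle): the definition of parse_name is assumed, not taken from the answer.
CREATE TABLE parse_name (name VARCHAR2(100));

INSERT INTO parse_name (name) VALUES ('Smith-Bay, Michael R.');
INSERT INTO parse_name (name) VALUES ('Abbott, David Jr.');
INSERT INTO parse_name (name) VALUES ('Actor, Cody');
INSERT INTO parse_name (name) VALUES ('Agular, Stephen V.');
COMMIT;

-- Values the query should then produce for the three extracted columns:
--   name                    last_name   first_name   middle_initial
--   Smith-Bay, Michael R.   Smith-Bay   Michael      R
--   Abbott, David Jr.       Abbott      David        Jr
--   Actor, Cody             Actor       Cody         (null)
--   Agular, Stephen V.      Agular      Stephen      V
```

A name with a second word after the first name but no trailing period (for example 'Agular, Stephen V') is not covered by this logic, which is why the format has to be consistent.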
I used a CTE only to keep the character positions separate from the actual extraction; it could all be done in a single SQL statement.
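As a rough sketch of that single-statement form, the position expressions can be inlined directly into the extraction. This assumes the same parse_name table and the same name format as the CTE version, and it drops the diagnostic position columns.

```sql
-- Sketch (Oracle): same logic as the CTE version, with the instr calls inlined.
SELECT name
     , substr(name, 1, instr(name, ', ') - 1) AS last_name
     , CASE WHEN instr(name, '.') = 0
            -- no period: the first name runs to the end of the string
            THEN substr(name, instr(name, ', ') + 2)
            -- otherwise stop at the 2nd space
            ELSE substr(name, instr(name, ', ') + 2,
                        instr(name, ' ', 1, 2) - (instr(name, ', ') + 2))
       END AS first_name
       -- characters between the 2nd space and the period; NULL when that length is < 1
     , substr(name, instr(name, ' ', 1, 2) + 1,
              (instr(name, '.') - 1) - instr(name, ' ', 1, 2)) AS middle_initial
  FROM parse_name;
```

The trade-off is readability: each instr expression is repeated several times, which is what the CTE avoids.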
Basically, this shows that you don't need anything beyond a few simple string-parsing functions. All you need are INSTR and SUBSTR (or their equivalents), which are available in most languages. No stored procedures, temp tables or extra outside code are needed, unless factors outside the scope of the original question make it necessary to use something other than plain SQL.
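For the SQL Server query in the question, the same idea might be expressed as a third cross apply. The following is only a sketch and is not part of the original answer: it assumes the users table and name column from the question, that every name contains ', ', and that the only periods to remove are the trailing ones. The alias vm and the column mi are names made up for this illustration.

```sql
-- Sketch (SQL Server): a third CROSS APPLY for whatever follows the first name.
SELECT vl.lastname, vf.firstname, vm.mi
FROM users AS t
CROSS APPLY (VALUES (
        LEFT(t.name, CHARINDEX(', ', t.name) - 1),           -- text before the comma
        STUFF(t.name, 1, CHARINDEX(', ', t.name) + 1, '')    -- text after ', '
     )) AS vl(lastname, rest)
CROSS APPLY (VALUES (
        LEFT(vl.rest, CHARINDEX(' ', vl.rest + ' ') - 1)     -- first word of rest
     )) AS vf(firstname)
CROSS APPLY (VALUES (
        -- drop the first name, trim the leading space, remove periods;
        -- NULLIF turns an empty remainder into NULL
        NULLIF(REPLACE(LTRIM(STUFF(vl.rest, 1, LEN(vf.firstname), '')), '.', ''), '')
     )) AS vm(mi)
ORDER BY vl.lastname;
```

Unlike the query in the question, this version subtracts 1 in both LEFT calls so that lastname does not keep the comma and firstname does not keep a trailing space. As a result, a name without ', ' would raise an error in LEFT rather than return an empty last name.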