Any SQL Query experts?

Soldato
Joined
28 Sep 2008
Posts
14,176
Location
Britain
I've always enjoyed SQL but this one has go me thinking.

It's SharePoint related, in that, I'm querying a SharePoint database directly from SQL management studio.

The query is based around the document naming convention within the organisation.

It must be:

YYYYMMDD_name.doc (etc)

Example:
20100913-Eat A Sandwich.doc

Only the following characters are allowed:
A-Z, a-z, 0-9, hyphen, round brackets, space, underscore.

So, I'm trying to right a query that will find any documents that don't conform to the naming convention AND that contain illegal characters:

Example:

20100913-Eat A Sandwich$$.doc (although in the right format, contains illegal charaters)
Eat A Sandwich_20100913-.doc (not in the right format)

Here's what I have so far:
(BTW, for those not familiar with SharePoint tables, LeafName is the field that contains the document name)

Code:
- Illegal characters
SELECT *
  FROM Docs
 WHERE LeafName LIKE '%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
  AND LeafName != ''
-- Start
AND LeafName NOT LIKE '[0-2][019][0-9][0-9]%'

As it stands, that query returns the docs with illegal characters just fine, but not those with the correct naming format that might contain illegal characters.
 
Last edited:
Don't your ANDs need to be ORs?

edit: Or is that query just for the illegal characters bit and you haven't wrote the format part yet?

Maybe? I want the query to do illegal characters and the format part.

Try using RegExr to test your regex. If you put a lot of valid/invalid filenames in the lower box you'll instantly see what is and isn't hit as you change the expression.

I know that SQL regex is a little different to javascript, but the similarities should allow you to test this.

Ok, thanks I'll take a look
 
Think this should do it. This returns those with the correct naming convention (in terms of the preceeding date) which also contain an invalid character.
Just change LIKE to NOT LIKE if need be

DECLARE @test VARCHAR(200)
SET @test = '20100913-Eat A Sandw£ich.doc'

SELECT @test WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'

--could treat each part seperately depending on what you want
SELECT @test
WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%'
AND @test LIKE '%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'

Wow, cool, thanks. Problem is, that only returns errors that meet the format but that contain illegal characters. I also need it to return all docs which don't meet the format, in the same result.
 
Yeah, we've dumped to new tables now. Well, to clarify, we created a massive migration catalogue with inner and outer joins galore. Now we just query the migration catalogue.

I got the query to work, so well in fact that only 10% of their 450k documents actually conform....:(
 
Back
Top Bottom