Any SQL Query experts?

Soldato
Joined
28 Sep 2008
Posts
14,184
Location
Britain
I've always enjoyed SQL but this one has go me thinking.

It's SharePoint related, in that, I'm querying a SharePoint database directly from SQL management studio.

The query is based around the document naming convention within the organisation.

It must be:

YYYYMMDD_name.doc (etc)

Example:
20100913-Eat A Sandwich.doc

Only the following characters are allowed:
A-Z, a-z, 0-9, hyphen, round brackets, space, underscore.

So, I'm trying to right a query that will find any documents that don't conform to the naming convention AND that contain illegal characters:

Example:

20100913-Eat A Sandwich$$.doc (although in the right format, contains illegal charaters)
Eat A Sandwich_20100913-.doc (not in the right format)

Here's what I have so far:
(BTW, for those not familiar with SharePoint tables, LeafName is the field that contains the document name)

Code:
- Illegal characters
SELECT *
  FROM Docs
 WHERE LeafName LIKE '%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
  AND LeafName != ''
-- Start
AND LeafName NOT LIKE '[0-2][019][0-9][0-9]%'

As it stands, that query returns the docs with illegal characters just fine, but not those with the correct naming format that might contain illegal characters.
 
Last edited:
Don't your ANDs need to be ORs?

edit: Or is that query just for the illegal characters bit and you haven't wrote the format part yet?
 
Last edited:
Don't your ANDs need to be ORs?

edit: Or is that query just for the illegal characters bit and you haven't wrote the format part yet?

Maybe? I want the query to do illegal characters and the format part.

Try using RegExr to test your regex. If you put a lot of valid/invalid filenames in the lower box you'll instantly see what is and isn't hit as you change the expression.

I know that SQL regex is a little different to javascript, but the similarities should allow you to test this.

Ok, thanks I'll take a look
 
Think this should do it. This returns those with the correct naming convention (in terms of the preceeding date) which also contain an invalid character.
Just change LIKE to NOT LIKE if need be

DECLARE @test VARCHAR(200)
SET @test = '20100913-Eat A Sandw£ich.doc'

SELECT @test WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'

--could treat each part seperately depending on what you want
SELECT @test
WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%'
AND @test LIKE '%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
 
Last edited:
Think this should do it. This returns those with the correct naming convention (in terms of the preceeding date) which also contain an invalid character.
Just change LIKE to NOT LIKE if need be

DECLARE @test VARCHAR(200)
SET @test = '20100913-Eat A Sandw£ich.doc'

SELECT @test WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'

--could treat each part seperately depending on what you want
SELECT @test
WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%'
AND @test LIKE '%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'

Wow, cool, thanks. Problem is, that only returns errors that meet the format but that contain illegal characters. I also need it to return all docs which don't meet the format, in the same result.
 
--SQL
SELECT @test WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
OR @test NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'



DECLARE @test VARCHAR(200)
--incorrect format, should return 1 row
SET @test = '0100913-Eat A Sandwich.doc'

SELECT @test WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
OR @test NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'

--dodgy char check, should return 1 row
SET @test = '20100913-Eat A Sandwic$h.doc'
SELECT @test WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
OR @test NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'

--correct format - should return no rows
SET @test = '20100913-Eat A Sandwich.doc'
SELECT @test WHERE @test LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[^-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
OR @test NOT LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]%[-a-zA-Z0-9()_ ]%.[a-zA-Z0-9][a-zA-Z0-9][a-zA-Z0-9]%'
 
Its returning that result because it doesnt match the format i.e. its 0100913 instead of 20100913.
The variable @test is set to different values in my SQL to test.
i.e.
SET @test = '0100913-Eat A Sandwich.doc'
then
SET @test = '20100913-Eat A Sandwic$h.doc'
then
SET @test = '20100913-Eat A Sandwich.doc'
 
Mainly because it isn't supported by MS, locks on the DB and also you can deadlock some of the SharePoint transactions like timer stuff.

I'm not sure about how feasible this is but it might be a better option to dump stuff into other tables and query them.

Maybe the above issues were only based around 2007 as I know it is fully supported to query the usage/logging database in SP2010 which comes with some custom views pre-built.
 
Yeah I just looked up more info and it is only querying the Logging DB in SP2010 that is supported. You are probably better using something like the SharePoint object model or Linq to SharePoint to pull out what you need.
 
just use nolock and/or read uncommitted isolation level and you should be fine. If the table is large and the DB heavily used then it might add additional load on it (depending on what indexes sharepoint has) so best to run it at a quite time as the SQL above will probably end up scanning the entire Docs table.
 
Back
Top Bottom