Removing extensions from file names
NovaScotian
03-27-2005, 02:52 PM
I have a list of files (returned by a list folder "folderPath" without invisibles AppleScript command) that includes the file extensions, {File1.ext, File2.ext, ...}. I want to present these in a choose from list instruction without the file extensions, if possible.
Is there an easy way that doesn't involve biting off the last four characters of each string in a repeat loop to remove these?
bramley
03-29-2005, 03:52 PM
I think the fastest way is below. It uses the UNIX command 'awk' instead of AppleScript looping, so it's quick. Note that it keeps only the text before the first dot in each name.
set theFolder to choose folder
set theListOfContents to list folder theFolder without invisibles
set AppleScript's text item delimiters to (ASCII character 10)
set theListOfContentsAsString to theListOfContents as string
set AppleScript's text item delimiters to ""
set theResponse to do shell script "echo " & quoted form of theListOfContentsAsString & "|awk -F. '{print $1}'"
set nameList to paragraphs of theResponse
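As a rough illustration of what the awk step does on its own, here is a minimal shell sketch. The function name first_field and the sample names are made up for the example; only the awk invocation comes from the script above.

```shell
#!/bin/sh
# first_field is a hypothetical helper name; the awk command is the one used above.
# With -F. the dot is the field separator, so $1 is everything before the FIRST dot.
first_field() {
    awk -F. '{print $1}'
}

# One name per line, as the AppleScript builds them with a linefeed delimiter.
printf '%s\n' 'File1.ext' 'My.Report.txt' 'README' | first_field
# prints:
# File1
# My
# README
```

The second sample shows the trade-off: a name with more than one dot is cut at the first dot, not the last.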
NovaScotian
03-29-2005, 07:11 PM
Thanks, bramley;
This is what I finally came up with myself. It expects a list and returns a list of names or extensions depending on "thePart", 0 for an extension, anything else for the name. I realize I could have saved some steps if I'd done the test inside the loop, but I started programming in 1963 and I've never broken the habit of worrying about repeated operations.
set theExtensions to getNameParts(theCals, 0)
set theNames to getNameParts(theCals, 1)
on getNameParts(theList, thePart)
    set extList to {}
    set nameList to {}
    set ListLen to length of theList
    if thePart = 0 then
        -- extensions: everything from the first "." to the end
        repeat with k from 1 to ListLen
            set thisText to item k of theList
            set thisDot to offset of "." in thisText
            set thisLen to length of thisText
            set thisExt to text thisDot thru thisLen of thisText
            set extList to extList & thisExt
        end repeat
        return extList
    else
        -- names: everything before the first "."
        repeat with k from 1 to ListLen
            set thisText to item k of theList
            set thisDot to offset of "." in thisText
            set thisName to text 1 thru (thisDot - 1) of thisText
            set nameList to nameList & thisName
        end repeat
        return nameList
    end if
end getNameParts
Reacher
03-30-2005, 12:29 PM
Provided all files are guaranteed to have extensions and only one "." in the filename, this should work too:
set fileList to list folder (choose folder) without invisibles
set myFileName to choose from list (getNames(fileList))
on getNames(fileList)
    set fileNames to {}
    set oldDelimiters to AppleScript's text item delimiters
    set AppleScript's text item delimiters to "."
    repeat with currentFile in fileList
        set currentName to first text item of currentFile
        set fileNames to fileNames & currentName
    end repeat
    set AppleScript's text item delimiters to oldDelimiters -- restore the previous delimiters
    return fileNames
end getNames
myFileName
NovaScotian
03-30-2005, 01:45 PM
This works very nicely for names (since we don't use prefixes normally), and misses out on such things as "MyXYZFile.txt.sit". Hadn't thought of that because it hasn't been a problem; good call though and nice use of delimiters.
hayne
03-30-2005, 03:43 PM
In case this is helpful, I'll explain how you could do this sort of thing in a shell script. Maybe you could incorporate this by using 'do shell script' in AppleScript.
If you know what the suffix is, then you could use the standard UNIX command 'basename'.
E.g. (in a Terminal window), if you want to remove the suffix ".txt":
basename foo.bar.txt .txt
would give:
foo.bar
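If several names share a known suffix, the same basename call can be applied to each one in turn. This is only a sketch: the helper name strip_known_suffix and the sample names are invented for the example, and it assumes plain file names without directory parts (basename would strip those too).

```shell
#!/bin/sh
# strip_known_suffix is a hypothetical helper: first argument is the suffix
# (for example ".txt"), remaining arguments are the file names.
strip_known_suffix() {
    suffix=$1
    shift
    for name in "$@"; do
        # basename removes the suffix only when the name actually ends with it
        basename "$name" "$suffix"
    done
}

strip_known_suffix .txt foo.bar.txt notes.txt archive.sit
# prints:
# foo.bar
# notes
# archive.sit
```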
Otherwise you could use the following shell script (which could also be set up as a shell alias):
#!/bin/sh
# Removes the suffix (from the last dot until the end of the string)
# from each of the strings passed, one per line, via STDIN
# Cameron Hayne (macdev@hayne.net) March 2005
# Example of use:
# echo "foo.bar.txt" | remove_suffix
# will output "foo.bar" (without the quotes)
sed 's/\.[^.]*$//'
NovaScotian
03-30-2005, 04:56 PM
Item 1 in this learning curve: how do I print out an entire man doc rather than what appears on the screen in blocks? (Reminds me of printing out Forth code in blocks of 1024 bytes).
Item 2: Hayne's 'sed' solution is so incredibly arcane/esoteric/cryptic it's scary. The man pages for sed reveal: sed = a stream editor that can read a file or an input "stream" and execute a script on the contents. Must be the "guts" of search and replace tools and probably the guts of several osaxen that will do search and replace.
sed understands regex which I assume I'm seeing here. The expression is in single quotes, I assume, rather than the single quotes being part of it. The leading "s" is a regular expression function to substitute the replacement string in front of the "s" for instances of the regex pattern that follows. With nothing in front of the "s", it says to sed "put nothing where you find this pattern".
If that's all correct, then (since there's no man entry for regex) what does "/\.[^\.]*$//" mean? I'm guessing, / = escape the backslash following, then \. must indicate "look for a period", I'm guessing [^\.] means do it some more, * must be a wildcard, $ is probably "all the way to the end", and I can't imagine what the // is.
Can you (will you, I mean; I know you can) explain further, hayne?
robot_guy
03-30-2005, 08:09 PM
This works very nicely for names (since we don't use prefixes normally), and misses out on such things as "MyXYZFile.txt.sit".
set fileList to list folder (choose folder) without invisibles
set myFileName to choose from list (getNames(fileList))
on getNames(fileList)
    set fileNames to {}
    repeat with currentFile in fileList
        set AppleScript's text item delimiters to "."
        try --remove last extension only
            set currentName to text items 1 thru -2 of currentFile
        on error --handle files with no extensions
            set currentName to currentFile
        end try
        set fileNames to fileNames & currentName
    end repeat
    return fileNames
end getNames
myFileName
robot_guy
03-30-2005, 08:45 PM
Sorry, I posted the above script modification without sufficient testing; in the case of files with two extensions, it returns the first extension as a separate file in the list of files.
Replacing the try block (from "try" through "end try") in the previous script with
set theParts to every text item of currentFile
set numParts to count of theParts
set currentName to item 1 of theParts
if numParts is greater than 1 then
    repeat with n from 2 to (numParts - 1)
        set currentName to currentName & "." & item n of theParts
    end repeat
end if
should work correctly.
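For comparison, the same rule (drop only the last dot-separated part, and leave names without a dot alone) can be sketched in shell with POSIX parameter expansion. The helper name drop_last_ext and the sample names are assumptions made for this example; it is not part of the AppleScript above.

```shell
#!/bin/sh
# drop_last_ext is a hypothetical helper name.
# ${name%.*} removes the shortest trailing match of ".*", i.e. the last
# extension only; a name without a dot is left unchanged.
drop_last_ext() {
    for name in "$@"; do
        printf '%s\n' "${name%.*}"
    done
}

drop_last_ext MyXYZFile.txt.sit File1.ext README
# prints:
# MyXYZFile.txt
# File1
# README
```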
mark hunte
03-31-2005, 01:32 AM
sed 's/\.[^\.]*$//'
a basic sed substitution looks like this
sed 's/PatternToLookFor/PatternToReplaceFirstPatternWith/'
So the three / are containers for the search and replace patterns
s tells sed to use substitution
\ escapes the .
// the replacement pattern is empty, so in effect it removes the found pattern
hayne can probably explain the rest better, as I have not used them yet, but I think you got most of the rest right.
to print the Man pages
try man sed |col -b > command.txt
or you might want to look at
Bwana (http://www.bruji.com/bwana/index.html)
which opens them in a webpage for you
..
Also I am starting to add any Man pages that I call up in bwana to the New Sticky Brain 3
http://www.chronosnet.com/Products/sb_product.html
Which has a Tiger 'Spotlight' type search Menu Bar
hayne
03-31-2005, 02:54 AM
s/\.[^.]*$// broken down into component parts:
s       substitute whatever matches the pattern inside the first pair of / with the string supplied inside the second pair of /
/       start of pattern
\.      a dot
[^.]    any character that is not a dot
*       match an arbitrary number of the preceding
$       the end of the line
/       end of pattern
        nothing (what is to be substituted for the characters that match the above pattern)
/       end of substitution
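To see the breakdown in action, here is a small sketch that runs the same substitution over a few sample names. The function name remove_suffix follows the usage comment in the script earlier in the thread; the sample names are made up for the example.

```shell
#!/bin/sh
# remove_suffix wraps the sed command discussed above.
remove_suffix() {
    sed 's/\.[^.]*$//'
}

printf '%s\n' 'foo.bar.txt' 'README' 'trailing.' | remove_suffix
# prints:
# foo.bar
# README
# trailing

# A name that starts with a dot and has no other dot matches in its
# entirety, so the result is an empty line.
printf '%s\n' '.hidden' | remove_suffix
```

Only the last dot and what follows it are removed, a name without a dot passes through unchanged, and dot-files come out empty, which may or may not be what you want.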
NovaScotian
03-31-2005, 08:43 AM
So typical of regular expressions: parsed like this they make perfect sense, but when I construct them from scratch they always seem to fail when tested using BBEdit's grep search feature.
NovaScotian
03-31-2005, 08:49 AM
Thanks for both the explanation and for the Bwana link. I use StickyBrain 3 too and couldn't live without it for lots of things besides man pages. Great idea - using it for them means a quick spotlight search without starting the terminal.
I see, reading the Bwana "Read Me.rtf" that our own "hayne" (a moderator of these forums) wrote some of the scripts in Bwana.
qwerty denzel
04-08-2005, 02:33 AM
set this_folder to (choose folder with prompt "Pick the folder containing the files to process:") as string
tell application "System Events"
    set these_files to every file of folder this_folder
end tell
set this_list to {}
repeat with i from 1 to the count of these_files
    set this_file to (item i of these_files as alias)
    set this_info to info for this_file
    if visible of this_info is true and alias of this_info is false then
        set this_name to name of this_info
        set this_extension to name extension of this_info
        -- strip the extension and its dot only when the file actually has an extension
        if this_extension is not missing value and this_extension is not "" then
            set this_name to text 1 thru (-(length of this_extension) - 2) of this_name
        end if
        set end of this_list to this_name
    end if
end repeat
choose from list this_list
Pieces used from 'Files of Chosen Folder' script (Under 'Iterate Items' in contextual menu).
hayne
10-23-2005, 02:48 PM
Thanks to user 'chabig' who pointed out in another thread that I didn't need (or want) the backslash inside the character class (square brackets) in the above regex. The dot (.) is not a meta-character inside a character class, so it doesn't need to be escaped with a backslash there. In fact, the presence of the backslash would have resulted in problems in the rare case where a filename actually contained a backslash.
Thus I have edited the above script to use [^.] where it used to use [^\.]
I have edited all of my posts in this thread to make this change, but I have not corrected those places in this thread where others have quoted my original (incorrect) version of the script.
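The difference between the two versions can be sketched as follows, assuming a sed that follows the POSIX rule that a backslash inside square brackets is an ordinary character. The helper names strip_new and strip_old and the sample name are made up for the example.

```shell
#!/bin/sh
# strip_new uses the corrected character class, strip_old the original one.
strip_new() { sed 's/\.[^.]*$//'; }
strip_old() { sed 's/\.[^\.]*$//'; }

# A (rare) file name whose extension contains a backslash.
name='foo.a\b'

# [^.] excludes only the dot, so the whole suffix matches and is removed.
printf '%s\n' "$name" | strip_new

# [^\.] excludes both the backslash and the dot, so the suffix cannot be
# matched up to the end of the line and the name is left unchanged.
printf '%s\n' "$name" | strip_old
```

For ordinary names without a backslash the two versions behave the same.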