Circumflex

Samba - linux mailing list
I was reading about manual section numbers. Then I came across '^'.
Where is it explained?

{
You can tell what sections a term falls in with man -k (equivalent to
the apropos command). It will do substring matches too (e.g. it will
show sprintf if you run man -k printf), so you need to use ^term to
limit it:
}

https://unix.stackexchange.com/questions/3586/what-do-the-numbers-in-a-man-page-mean#3587
--
www.netspeed.com.au/bryan/

--
linux mailing list
[hidden email]
https://lists.samba.org/mailman/listinfo/linux

Re: Circumflex

Samba - linux mailing list
That would be a glob or a regex.


--
Kim Holburn
IT Network & Security Consultant
T: +61 2 61402408  M: +61 404072753
mailto:[hidden email]  aim://kimholburn
skype://kholburn - PGP Public Key on request





Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Hi Bryan,

On 08/17/17 18:25, Bryan Kilgallin (iiNet) via linux wrote:
> I was reading about manual section numbers. Then I came across '^'. Where is it explained?
>
> {
> You can tell what sections a term falls in with man -k (equivalent to the apropos command). It will do substring matches too (e.g. it will show sprintf if you run man -k printf), so you need to use ^term to limit it:
> }
>
> https://unix.stackexchange.com/questions/3586/what-do-the-numbers-in-a-man-page-mean#3587

In a regular expression the '^' (at the start) indicates that the matching expression must be at the
beginning of the examined string:

$ man -k awk
awk (1)              - pattern scanning and processing language
awk (1p)             - pattern scanning and processing language
English (3pm)        - use nice English (or awk) names for ugly punctuation variables
filefuncs (3am)      - provide some file related functionality to gawk
gawk (1)             - pattern scanning and processing language
igawk (1)            - gawk with include files
readdir (3am)        - directory input parser for gawk
rwarray (3am)        - write and read gawk arrays to/from files
states (1)           - awk alike text processing tool
time (3am)           - time functions for gawk

$ man -k ^awk
awk (1)              - pattern scanning and processing language
awk (1p)             - pattern scanning and processing language
states (1)           - awk alike text processing tool
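
The anchored search still lists `states (1)` because the pattern appears to be tried against the description as well as the page name, and that description begins with "awk". Here is a minimal Python sketch of that behaviour. The `entries` list and the `apropos` helper are made up for illustration, and the real `man -k` may differ in detail.

```python
import re

# Hypothetical sample of (name, description) pairs taken from the output above.
entries = [
    ("awk", "pattern scanning and processing language"),
    ("gawk", "pattern scanning and processing language"),
    ("igawk", "gawk with include files"),
    ("states", "awk alike text processing tool"),
]

def apropos(pattern):
    """Return the names whose name or description matches the regex."""
    rx = re.compile(pattern)
    return [name for name, desc in entries
            if rx.search(name) or rx.search(desc)]

print(apropos("awk"))   # unanchored: "awk" may appear anywhere
print(apropos("^awk"))  # anchored: "awk" must start the name or the description
```

Under these assumptions the unanchored call returns all four names, while the anchored call returns only `awk` and `states`.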
--
Eyal Lebedinsky ([hidden email])


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Oh, apparently man -k does regex

Here are some basics of regex pattern matching:

http://webagility.com/posts/the-basics-of-regex-explained 

It's a very powerful system of pattern matching, though it has a steep learning curve.

--
Kim Holburn
IT Network & Security Consultant
T: +61 2 61402408  M: +61 404072753
mailto:[hidden email]  aim://kimholburn
skype://kholburn - PGP Public Key on request





Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Thanks, Kim:

> That would be a glob or a regex.

"Parameter expansion (Globbing)

When an argument for a program is given on the commandline, it undergoes
the process of parameter expansion before it is sent on to the command."

http://fishshell.com/docs/current/index.html#expand

"regular expression

1.   (text, operating system)   (regexp, RE) One of the wild card
patterns used by Perl and other languages, following Unix utilities such
as grep, sed, and awk and editors such as vi and Emacs. Regular
expressions use conventions similar to but more elaborate than those
described under glob."

http://foldoc.org/regular%20expression

Any intro tutes on these concepts?
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Thanks, Eyal:

> In a regular expression the '^' (at the start) indicates that the
> matching expression must be at the
> beginning of the examined string:

{
More special characters ^$

Used outside of the square brackets the ^ and $ sign will have new
meanings. And yes, this is a big part of why Regex is a bit confusing to
start up with. Instead of the ^ meaning “not these characters” it means
“start of the string”.
}

http://webagility.com/posts/the-basics-of-regex-explained

I could use drill and practice exercises.
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list


On 17/08/17 22:43, Bryan Kilgallin (iiNet) via linux wrote:

> Thanks, Eyal:
>
>> In a regular expression the '^' (at the start) indicates that the
>> matching expression must be at the
>> beginning of the examined string:
>
> {
> More special characters ^$
>
> Used outside of the square brackets the ^ and $ sign will have new
> meanings. And yes, this is a big part of why Regex is a bit confusing to
> start up with. Instead of the ^ meaning “not these characters” it means
> “start of the string”.
> }
>
> http://webagility.com/posts/the-basics-of-regex-explained
>
> I could use drill and practice exercises.

I can highly recommend "Mastering Regular Expressions" by Jeffrey E. F.
Friedl.

You'll probably find your text editor includes RegEx support for Find
and Replace (KWrite certainly does) - which is a good place to practise.

Here's a list of basic RegEx Special Characters, Quantifiers and
Metacharacters. Note that the formatting was lost when I copied it from
my personal wiki, and that there is more than one type of RegEx... :)
See man regex for that!
(if the following is unreadable there's a PDF version of the original at
https://scottferguson.com.au/uploads/files/regex.pdf for a short time)

RegEx

A regular expression, regex or regexp (sometimes called a rational
expression) is, in theoretical computer science and formal language
theory, a sequence of characters that define a search pattern, mainly
for use in pattern matching with strings, or using a string searching
algorithm, i.e. “find and replace”-like operations. The concept arose in
the 1950s, when the American mathematician Stephen Cole Kleene
formalised the description of a regular language, and came into common
use with the Unix text processing utilities ed, an editor, and grep, a
filter.

abc…        Letters
123…        Digits
\d          Any Digit
\D          Any Non-digit character
.           Any Character
\.          Period
[abc]       Only a, b, or c
[^abc]      Not a, b, nor c
[a-z]       Characters a to z
[0-9]       Numbers 0 to 9
\w          Any Alphanumeric character
\W          Any Non-alphanumeric character
{m}         m Repetitions
{m,n}       m to n Repetitions
*           Zero or more repetitions
+           One or more repetitions
?           Optional character
\s          Any Whitespace
\S          Any Non-whitespace character
^…$         Starts and ends
(…)         Capture Group
(a(bc))     Capture Sub-group
(.*)        Capture all
(abc|def)   Matches abc or def

Metacharacter  Name                     Matches
.              dot                      any one character
[…]            character class          any character listed
[^…]           negated character class  any character not listed
^              caret                    the position at the start of the line
$              dollar                   the position at the end of the line
\<             backslash less-than      1) the position at the start of a word
\>             backslash greater-than   2) the position at the end of a word
|              or, bar                  matches either expression it separates
(…)            parentheses              used to limit scope of |, plus additional uses

Quantifiers

     Minimum required  Maximum to try  Meaning
?    none              1               one allowed; none required (“one optional”)
*    none              no limit        unlimited allowed; none required (“any amount OK”)
+    1                 no limit        unlimited allowed; one required (“at least one”)
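
For drill and practice, a few entries from the list above can be turned into a small self-checking exercise. This is only a sketch using Python's `re` module, whose dialect is close to Perl's; the `drills` table and the `run_drills` helper are invented for illustration, and grep's basic and extended dialects do not accept every pattern shown here.

```python
import re

# (pattern, text, should_match) triples based on entries in the list above.
drills = [
    (r"\d", "room 7", True),          # any digit
    (r"\D", "12345", False),          # any non-digit
    (r".", "abc", True),              # any character
    (r"\.", "abc", False),            # a literal period
    (r"[^abc]", "cab", False),        # not a, b, nor c
    (r"[a-z]", "ABC", False),         # characters a to z (case-sensitive)
    (r"^ab+c$", "abbbc", True),       # one or more repetitions
    (r"^ab?c$", "abbc", False),       # optional character
    (r"a{2,3}", "caat", True),        # m to n repetitions
    (r"(abc|def)", "xdefx", True),    # matches abc or def
    (r"\s", "nospace", False),        # any whitespace
]

def run_drills():
    """Return the drills whose actual result differs from the expected one."""
    return [(p, t) for p, t, expected in drills
            if bool(re.search(p, t)) != expected]

print(run_drills())  # an empty list means every prediction was right
```

Changing a pattern or a text and guessing the result before running it is a cheap way to practise.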


Kind regards

--
    A: Because we read from top to bottom, left to right.
    Q: Why should I start my reply below the quoted text?

    A: Because it messes up the order in which people normally read text.
    Q: Why is top-posting such a bad thing?

    A: The lost context.
    Q: What makes top-posted replies harder to read than bottom-posted?

    A: Yes.
    Q: Should I trim down the quoted part of an email to which I'm replying?

http://www.idallen.com/topposting.html


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Hi Bryan,

Actually, programmers don't tend to say circumflex. I usually say caret, but there are other names. The accent in ê or sûr is a circumflex.

https://blog.codinghorror.com/ascii-pronunciation-rules-for-programmers/

http://ascii-table.com/pronunciation-guide.php

Kim


--
Kim Holburn
IT Network & Security Consultant
T: +61 2 61402408  M: +61 404072753
mailto:[hidden email]  aim://kimholburn
skype://kholburn - PGP Public Key on request





Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Thanks Kim:

> Oh, apparently man -k does regex

I am trying to learn these useful conventions.

> Here are some basics of regex pattern matching:
>
> http://webagility.com/posts/the-basics-of-regex-explained
>
> It's a very powerful system of pattern matching that has a high learning curve.

I also found the following.
http://www.regular-expressions.info/tutorial.html

--
www.netspeed.com.au/bryan/



Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Thanks Scott:

>> I could use drill and practice exercises.
>
> I can highly recommend "Mastering Regular Expressions" by Jeffrey E. F.
> Friedl.

Being an impoverished pensioner, I avoid buying stuff!
An ACT library catalogue search turned up books on Java, PHP, and
link-building.
https://www.librarycatalogue.act.gov.au/ipac20/ipac.jsp?profile

> You'll probably find your text editor includes RegEx support for Find
> and Replace (KWrite certainly does) - which is a good place to practise.

Gedit doesn't--though I found this for Vim.
http://www.vimregex.com/

> A regular expression, regex or regexp (sometimes called a rational
> expression) is, in theoretical computer science and formal language
> theory, a sequence of characters that define a search pattern, mainly
> for use in pattern matching with strings, or using a string searching
> algorithm, i.e. “find and replace”-like operations.

Perhaps I need a wall-chart, to help with rote learning.
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Thanks, Kim:

> Actually, programmers don't tend to say circumflex.  I usually use caret but there are other names.

ASCII 94 seems referred to interchangeably as "circumflex accent" or
"caret". I wanted specifically its use in Linux.

"In regular expressions, the caret is used to mark the beginning of a
string, or the beginning of a line within that string (depending on the
regular expression dialect and specified options); if it begins a
character class, it indicates that the inverse of the class is to be
matched."

https://en.wikipedia.org/wiki/Caret#Programming_languages
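
The three roles of the caret described in that passage can be seen side by side in a short Python sketch. Python's `re` module is only one dialect, and the sample strings are made up for illustration.

```python
import re

text = "first line\nsecond line"

# By default, ^ anchors only at the very start of the string.
start_of_string = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also anchors just after each newline.
start_of_line = re.findall(r"^\w+", text, flags=re.MULTILINE)

# As the first character inside brackets, ^ negates the class instead.
not_vowel_or_space = re.findall(r"[^aeiou\s]", "a bee")

print(start_of_string)
print(start_of_line)
print(not_vowel_or_space)
```

Tools such as grep read their input one line at a time, so for them the caret effectively means "start of the line".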

> https://blog.codinghorror.com/ascii-pronunciation-rules-for-programmers/
>
> http://ascii-table.com/pronunciation-guide.php

I have bookmarked those URLs.
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Scott:

I have been practising, using egrep on a text file.

"[^abc] Not a, b, nor c"

But I couldn't get negation working!
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list
On 08/18/17 23:25, Bryan Kilgallin (iiNet) via linux wrote:
> Scott:
>
> I have been practising, using egrep on a text file.
>
> "[^abc]     Not a, b, nor c"
>
> But I couldn't get negation working!

Can you show what is not working?

The simple expression "[^abc]" will match any line that includes at least one character
that is "Not a, b, nor c", which usually means every line in a common text file...

To test just that, use
$ echo 'a' | grep '[^abc]'
$ echo 'd' | grep '[^abc]'
d
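
The same point in a Python sketch, for anyone who wants to experiment further. The `lines` list is invented, and Python's `re.search` stands in for grep's line-by-line matching.

```python
import re

lines = ["abc", "cab", "abcd", "xyz"]

# [^abc] succeeds as soon as ANY character on the line is not a, b or c.
has_other = [s for s in lines if re.search(r"[^abc]", s)]

# To keep only lines made up ENTIRELY of other characters, anchor both ends.
only_other = [s for s in lines if re.search(r"^[^abc]*$", s)]

# Lines containing nothing but a, b and c: invert the first test (like grep -v).
only_abc = [s for s in lines if not re.search(r"[^abc]", s)]

print(has_other)
print(only_other)
print(only_abc)
```

In other words, a negated class negates one character position, not the whole line.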

cheers

--
Eyal Lebedinsky ([hidden email])


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
I have been trying out egrep on an old text file tabulating American
bulletin boards such as the following.

{
/ Number / Name           /Type  /Baud rate /Type of BBS

---------------------------------------------------------------

221-0774 /CCIS Hopewell   / IBM  / 300/1200 /General BBS
}

So I have been applying search terms in a RegEx document supplied by
Scott. Therein I read
        "\d Any Digit", and also
        "[0-9] Numbers 0 to 9".

I can get one of those phone-number records by searching thus.
        egrep '^[0-9]'
But the following selects nothing!
        egrep '^\d'
What went wrong?
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list
On 19/08/17 22:50, Bryan Kilgallin (iiNet) via linux wrote:

> I have been trying out egrep on an old text file tabulating American bulletin boards such as the following.
>
> {
> / Number / Name           /Type  /Baud rate /Type of BBS
>
> ---------------------------------------------------------------
>
> 221-0774 /CCIS Hopewell   / IBM  / 300/1200 /General BBS
> }
>
> So I have been applying search terms in a RegEx document supplied by Scott. Therein I read
>      "\d    Any Digit", and also
>      "[0-9]    Numbers 0 to 9".
>
> I can get one of those phone-number records by searching thus.
>      egrep '^[0-9]'
> But the following selects nothing!
>      egrep '^\d'
> What went wrong?

I think that the '\d' notation is a Perl extension, so try
        grep -P '^\d'

Note that 'egrep' is short for 'grep -E' (extended regex) and you cannot use '-E' and '-P'
together as they have conflicting rules.
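
The difference between the two spellings is one of dialect rather than meaning. Python's `re` module happens to accept both, so a small sketch (assuming Python 3, with the `lines` list copied from the sample records quoted above) can confirm that they select the same record:

```python
import re

lines = [
    "/ Number / Name           /Type  /Baud rate /Type of BBS",
    "---------------------------------------------------------------",
    "221-0774 /CCIS Hopewell   / IBM  / 300/1200 /General BBS",
]

# Bracket form: understood by basic, extended and Perl-style dialects alike.
with_class = [s for s in lines if re.search(r"^[0-9]", s)]

# Shorthand form: a Perl-style extension. For Python 3 text patterns it also
# matches non-ASCII decimal digits, which [0-9] does not.
with_shorthand = [s for s in lines if re.search(r"^\d", s)]

print(with_class == with_shorthand)
```

When a pattern has to work across tools, the bracket form is the more portable choice.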

--
Eyal Lebedinsky ([hidden email])


Re: Circumflex

Samba - linux mailing list
Thanks Eyal:

> I think that the '\d' notation is a Perl extension, so try
>      grep -P '^\d'

Yes, that worked!

> Note that 'egrep' is short for 'grep -E' (extended regex) and you cannot
> use '-E' and '-P'
> together as they have conflicting rules.

I recall reading that egrep had a limited implementation of regex. When
should grep be used, and when egrep?
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Thanks, Eyal:

> I think that the '\d' notation is a Perl extension, so try
>      grep -P '^\d'

I was confused among versions of regular expressions! Discovering by
trial and error. So to search for `-' I had to escape it with `\'.

What Web resource explains standard, extended, and Perl variants?

> Note that 'egrep' is short for 'grep -E' (extended regex) and you cannot
> use '-E' and '-P'
> together as they have conflicting rules.

Grep highlighted searched characters--while egrep did not.
--
www.netspeed.com.au/bryan/


Re: Circumflex

Samba - linux mailing list


On 20/08/17 17:33, Bryan Kilgallin (iiNet) via linux wrote:
> Thanks, Eyal:
>
>> I think that the '\d' notation is a Perl extension, so try
>>      grep -P '^\d'

Match on start of line followed by any digit
grep -P '^\d'
or
grep -P "^\d"

(I suspect the "" is incorrect, though it works).

>
> I was confused among versions of regular expressions! Discovering by
> trial and error. So to search for `-' I had to escape it with `\'.
>
> What Web resource explains standard, extended, and Perl variants?

https://en.wikipedia.org/wiki/Regular_expression

or, look at the book I sent you (Chapter 7 is Perl, 8 is Java, 9 is .NET)
Page 144 covers the different engines.

Note that pages 6-22 are just about egrep.

<snipped>

Use curl and grep to get all the links from the Wikipedia page:
curl https://en.wikipedia.org/wiki/Regular_expression 2>&1 | grep -o -E \
'href="([^"#]+)"' | cut -d'"' -f2
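
To see what that pattern keeps and what it skips without fetching anything, here is a Python sketch that applies the same expression to a made-up HTML snippet. The `html` string is hypothetical, and the capture group plays the part of the `cut` step.

```python
import re

# Hypothetical HTML standing in for the downloaded page.
html = (
    '<a href="/wiki/Grep">grep</a> '
    '<a href="#cite_note-1">[1]</a> '
    '<a href="/wiki/Sed#History">sed</a> '
    '<a href="https://example.org/">x</a>'
)

# [^"#]+ means one or more characters that are neither a quote nor a hash,
# so any link containing a # fragment fails to match and is left out.
links = re.findall(r'href="([^"#]+)"', html)

print(links)
```

With this sample, only the first and last links survive, since the other two contain a `#`.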

Re: searching man files. "man -k" is the equivalent of "apropos". To
brute force search man files use "man -K $searchterm" (warning: it
may take a while). Or just open a man file and type "/" followed by the
search term, then press Enter to jump to the first match (if there is one).


Kind regards


Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list

> On 2017/Aug/20, at 5:33 PM, Bryan Kilgallin (iiNet) via linux <[hidden email]> wrote:
>
> Thanks, Eyal:
>
>> I think that the '\d' notation is a Perl extension, so try
>>     grep -P '^\d'
>
> I was confused among versions of regular expressions! Discovering by trial and error. So to search for `-' I had to escape it with `\'.

to search for - in grep you can also use '[-]'
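
The bracket trick works because a hyphen that is first or last inside a character class is taken literally, while a hyphen between two characters denotes a range. A brief Python sketch, assuming the sample record from earlier in the thread:

```python
import re

record = "221-0774 /CCIS Hopewell"

# A hyphen alone in brackets (or first/last in a class) is a literal hyphen.
literal = re.findall(r"[-]", record)

# Here 0-9 is a range, and the trailing hyphen is literal.
digits_or_hyphen = re.findall(r"[0-9-]+", record)

print(literal)
print(digits_or_hyphen)
```

The second pattern picks out the whole phone number in one piece.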

Regular expressions are confusing.  Every language and program has a slightly different set.  And some like grep have 3 different sets!  The full perl reference is here:

http://perldoc.perl.org/perlre.html

Vim uses a different syntax for regex: http://www.vimregex.com/

Python is different again: https://docs.python.org/2/howto/regex.html

Kim

--
Kim Holburn
IT Network & Security Consultant
T: +61 2 61402408  M: +61 404072753
mailto:[hidden email]  aim://kimholburn
skype://kholburn - PGP Public Key on request





Re: Circumflex

Samba - linux mailing list
In reply to this post by Samba - linux mailing list
Thanks, Scott:

> Match on start of line followed by any digit
> grep -P '^\d'
> or
> grep -P "^\d"
>
> (I suspect the "" is incorrect, though it works).

{Sometimes features such as parameter expansion and character escapes
get in the way. When that happens, the user can write a parameter within
quotes, either ' (single quote) or " (double quote). There is one
important difference between single quoted and double quoted strings:
When using double quoted string, variable expansion still takes place.
Other than that, no other kind of expansion (including brace expansion
and parameter expansion) will take place, the parameter may contain
spaces, and escape sequences are ignored. The only backslash escape
accepted within single quotes is \', which escapes a single quote and
\\, which escapes the backslash symbol. The only backslash escapes
accepted within double quotes are \", which escapes a double quote, \$,
which escapes a dollar character, \ followed by a newline, which deletes
the backslash and the newline, and lastly \\, which escapes the
backslash symbol. Single quotes have no special meaning within double
quotes and vice versa.}

http://fishshell.com/docs/current/index.html

>> What Web resource explains standard, extended, and Perl variants?
>
> https://en.wikipedia.org/wiki/Regular_expression

I have bookmarked that page.

> or, look at the book I sent you

I am on page twenty five of it.

Cheers,
B.
--
www.netspeed.com.au/bryan/
