20180405j

string.Split

In Python:

>>> s = "this is            a test         "
>>> s.split()
['this', 'is', 'a', 'test']
>>>

In C#:

csharp> var s = "this is            a test         "
csharp> s.Split()
{ "this", "is", "", "", "", "", "", "", "", "", "", "", "", "a", "test", "", "", "", "", "", "", "", "", "" }

Hmm, not exactly what we expected: Split() splits at every single whitespace character, so each run of spaces produces empty strings. Here is the remedy:

csharp> var s = "this is            a test         "
csharp> s.Split(new char[0], StringSplitOptions.RemoveEmptyEntries);
{ "this", "is", "a", "test" }
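For comparison, Python produces the same kind of empty strings once an explicit separator is passed; only its no-argument split() collapses runs of whitespace. A minimal sketch (the sample string is shortened for illustration and is not the one used above):

```python
# Shortened sample string (an assumption for illustration).
s = "this is   a test  "

# No argument: a run of whitespace acts as one separator, ends are ignored.
no_arg = s.split()

# Explicit separator: every single space splits, so empty strings appear.
explicit = s.split(" ")

print(no_arg)    # ['this', 'is', 'a', 'test']
print(explicit)  # ['this', 'is', '', '', 'a', 'test', '', '']
```

So the C# result above corresponds to Python's explicit-separator behavior, not to its no-argument form.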

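An empty separator array means "split on whitespace", and RemoveEmptyEntries drops the empty strings. Now let's wrap it in an extension method: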

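using System;

// extension methods must live in a static class (the class name here is arbitrary)
public static class StringExtensions
{
    // split by whitespace and remove empty entries (like Python's s.split())
    public static string[] SplitAndRemoveEmptyEntries(this string s)
    {
        return s.Split(new char[0], StringSplitOptions.RemoveEmptyEntries);
    }
}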

Some tests:

const string s4 = "   aa     bb    \t    cc        dd\n    ";
Assert.Equal(new[] {"aa", "bb", "cc", "dd"}, s4.SplitAndRemoveEmptyEntries());

See the current version here.

Update (20221203):

You can also use regular expressions:

csharp> using System.Text.RegularExpressions                      
csharp> string s = "   aa     bb    \t    cc        dd\n    "    
csharp> Regex.Split(s, "\\s+")                                    
{ "", "aa", "bb", "cc", "dd", "" }

Notice the empty strings at the beginning and at the end! Here is how to get rid of them:

csharp> using System.Text.RegularExpressions                  
csharp> string s = "   aa     bb    \t    cc        dd\n    "
csharp> Regex.Split(s.Trim(), "\\s+")
{ "aa", "bb", "cc", "dd" }

Trimming the string first removes the leading and trailing whitespace, so no empty strings are produced at either end.
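Python's re.split behaves the same way, which suggests the empty strings come from regex splitting in general rather than from C# specifically. The sketch below also probes one edge case, a whitespace-only string: in Python the trimmed regex split returns a single empty string, where s.split() would return an empty list. Whether Regex.Split matches this exactly is not checked here.

```python
import re

s = "   aa     bb    \t    cc        dd\n    "

# Splitting the raw string leaves an empty string at each end.
raw = re.split(r"\s+", s)

# Stripping first removes the leading and trailing whitespace runs.
trimmed = re.split(r"\s+", s.strip())

# Edge case: whitespace-only input still yields one empty string.
blank = re.split(r"\s+", "   ".strip())

print(raw)      # ['', 'aa', 'bb', 'cc', 'dd', '']
print(trimmed)  # ['aa', 'bb', 'cc', 'dd']
print(blank)    # ['']
```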

However, here is the version that's the easiest to remember:

csharp> string s = "   aa     bb    \t    cc        dd\n    "
csharp> s.Split().Where(p => p.Length > 0)  
{ "aa", "bb", "cc", "dd" }

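The result is an IEnumerable<string>; add .ToArray() or .ToList() at the end to get an array or a list of strings. In a regular program, Where requires using System.Linq.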
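The same "split at every whitespace character, then filter" idea can be sketched in Python. The helper name split_nonempty is hypothetical, and re.split with a single-character pattern stands in for C#'s no-argument Split():

```python
import re

def split_nonempty(s):
    """Split at every single whitespace character, then drop empty pieces."""
    # r"\s" matches one whitespace character at a time, so runs of
    # whitespace produce empty strings that the filter removes.
    return [p for p in re.split(r"\s", s) if p]

s = "   aa     bb    \t    cc        dd\n    "
print(split_nonempty(s))  # ['aa', 'bb', 'cc', 'dd']
```

For this input the result agrees with Python's built-in s.split(), and an empty string gives an empty list.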

Page last modified on 2023 November 24, 11:26