Proposal: Names with whitespaces #8542

Closed
p-e-timoshenko opened this issue Feb 10, 2016 · 29 comments

@p-e-timoshenko

Is there a technical reason why white spaces aren't allowed in names of constants, variables, methods, classes and namespaces or is it in accordance with any convention? This question can also be asked in relation to the digits that can’t be placed at the beginning of the name.

The main reason to follow a naming convention is to reduce the effort needed to read and understand source code. The developers could focus on more important issues than arguing over syntax and naming standards. For this reason there is a strict rule that names are usually case-sensitive. A name can be any legal identifier — an unlimited-length sequence of Unicode letters and digits, beginning with a letter or the underscore character. White space in names is not permitted.

The last rule exists not only because of difficulties with understanding code. The compiler needs to work out the meaning of words. It works as a “state machine” and needs to distinguish keywords. However, there are languages like SQL or Maple where names can contain white spaces. Such names are enclosed on both sides by special quotation characters: “[” and “]” in SQL, “`” (U+0060, grave accent) in Maple. In C#, square brackets are already used for indexing collection items, but the grave accent could serve as the enclosing character.

Let's try to write some C# code using quite long names containing white spaces and see what happens.

[`Custom Serialization Attribute`]
public class `Config Data File Repository` : `Configuration Repository Interface` {
  public enum `Data Visualization Format` { `2D`, `3D` }
  public `Data Visualization Format` `Visualization Format` { get; set; } 
    = `Data Visualization Format`.`2D`;
  ... 
  private readonly string `repository file name`;

  public `Config Data File Repository`(string `repository file name`) {
    this.`repository file name` = `repository file name`;
  }

  public `Configuration Data` Load() {
    using(var fs = new `File Stream`(`repository file name`, `File Mode`.Open)) {
      return `Load From Stream`(fs);
    }
  }

  public void Save(`Configuration Data` `current configuration`) {
    using (var fs = new `File Stream`(
      this.`repository file name`,
      `File Mode`.`Create New`))
    {
      `Save To Stream`(`current configuration`, fs);
    }
  }
...
  public const string `DEFAULT REPOSITORY FILE` =
    "Default Config Data File Repository.config";

  public static `Config Data File Repository` `Create Config Data File Repository`(
    string `file Name` = `Config Data File Repository`.`DEFAULT REPOSITORY FILE`) {
    return new `Config Data File Repository`(`file Name`);
  }
}

No doubt, it is possible to make the compiler parse this code. However, the code has become much harder to read. Surprising for me, the readability of the code decreased just slightly! The bigger problem is the already existing issue of overly long names. A large number of articles (“Very Long Descriptive Names That Programming Pairs Think Provide Good Descriptions” or “Best Practices for Variable and Method Naming”) offer useful solutions. This issue has existed since the origin of object-oriented languages. In my view, when thinking about code it is useful to look at it from the perspective of the other developers who may use it. This helps not only with picking names, but also with keeping your code readable without writing documentation and unnecessary comments. If a name is too long, then maybe that part of the code has multiple responsibilities and should be split up. In other words, the code should be refactored.

There is also the question of how to improve readability and where to use the feature. The main areas of application might be unit-test classes, and wrapper classes that expose internationalized strings from string resources or configuration settings sections as properties. Readability could be improved with IDE support. For example, names with white spaces could be highlighted with a thin dotted border. This would let developers identify such names quickly at a glance, and would discourage widespread, thoughtless use of names with white spaces.

Let's try to write a simple unit test in C#.

namespace `My Modern Interpreter`.`Unit Tests`{

  [`Test Fixture`]
  public class `Simple Text Operations Tests` {
    [Test] public void `Test 1: Equal To`() 
      { ... }
    [Test] public void `Test 2: Equal To When Not Equal`() 
      { ... }
    [Test] public void `3. Text Using Single Quote Intermixed With Double Quote`() 
      { ... }
    [Test] public void `4. Text Using Double Quote Intermixed With Single Quote`() 
      { ... }
    [Test] public void `5. Implicit String To Expression Conversion`() 
      { ... }
  ...
}}

The unit test code in C# is very readable. It doesn't need additional attributes to be displayed correctly in the test reporting window. However, it could still be useful to add attributes that specify test categories.

Frequent use of such long names in code isn't the best approach. But if it is necessary to refer to them, it may be useful to shorten them the way natural languages do: some non-significant parts of the name can be replaced by ellipsis characters.

public static `Config `…` Repository` `Create Config Data File Repository`(
  string `file Name` = `Config`…`DEFAULT`…) {
  return new `Config `…` Repository`(`file Name`);
}
@alrz
Member

alrz commented Feb 10, 2016

My eyes.

the code has become much harder to read. Surprising for me, the readability of the code decreased just slightly

If you argue that PascalCase is not readable enough, why propose an alternative that you know is less readable?

@HaloFour

Technically legal to the CLR but I really don't think that this is a good idea, particularly with white space. I honestly think that your escaped example is much less readable given that you have to visually parse out the specific beginning and ending escape quotes while mentally bookkeeping the whitespace which would otherwise serve as visual indicators.
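
The "technically legal to the CLR" point can be checked with a small sketch. It is not from the thread: it assumes a runtime where System.Reflection.Emit is available, and the names ClrNameDemo, SpacedNamesDemo and SpacedDemoType are hypothetical. It emits a static method whose metadata name contains spaces and calls it through reflection, which is the only way C# source can reach such a name today.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ClrNameDemo
{
    // Emits a static method with the given metadata name (spaces allowed)
    // and returns the result of invoking it through reflection.
    public static int EmitAndInvoke(string methodName)
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("SpacedNamesDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("SpacedNamesDemo");
        TypeBuilder type = module.DefineType("SpacedDemoType", TypeAttributes.Public);

        MethodBuilder method = type.DefineMethod(
            methodName,
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int),
            Type.EmptyTypes);

        // Method body: push 2 and 2, multiply, return the product.
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldc_I4_2);
        il.Emit(OpCodes.Ldc_I4_2);
        il.Emit(OpCodes.Mul);
        il.Emit(OpCodes.Ret);

        Type created = type.CreateType();
        MethodInfo emitted = created.GetMethod(methodName);
        return (int)emitted.Invoke(null, null);
    }

    public static void Main()
    {
        // The name below cannot be written as a C# identifier,
        // but the runtime accepts it as a method name.
        Console.WriteLine(EmitAndInvoke("2 times 2 should be 4"));
    }
}
```

The sketch only shows that the runtime tolerates the name; it says nothing about whether the C# grammar should.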

@axel-habermaier
Contributor

F# supports that, which is actually very nice for unit tests -- especially since test runners such as Resharper's already display the test names correctly.

[<Test>]
let ``2 times 2 should be 4`` () =
   2 * 2 |> should be 4

Except for tests, however, this naming style should be avoided, in my opinion, as it makes things less readable.

@alrz
Member

alrz commented Feb 10, 2016

@axel-habermaier Agreed, but when you start to use it like everywhere, it becomes less nice.

@DavidArno

@alrz,

Then don't use it like everywhere :)

It's a really nice feature for test method names and therefore is a good proposal. The fact that bad programmers will misuse it should not be a blocker to this proposal. If we adopted that strategy, we'd have no programming languages at all!

@HaloFour

Test reports could just as easily replace underscores with spaces. Language features shouldn't be designed around how very specific tools happen to display their data.
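
A minimal sketch of that tooling-side alternative, assuming tests are named with underscores. The helper TestNameFormatter and the fixture UnderscoreNamedTests are hypothetical and not part of any real test runner; a real runner would apply the same transformation wherever it renders method names.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class TestNameFormatter
{
    // Turns an underscore-separated method name into display text.
    public static string ToDisplayName(string methodName)
    {
        return methodName.Replace('_', ' ').Trim();
    }
}

// Hypothetical fixture, used only to have some names to format.
public class UnderscoreNamedTests
{
    public void Two_times_two_should_be_four() { }
    public void Empty_text_equals_empty_text() { }
}

public static class ReportDemo
{
    public static void Main()
    {
        // Collect the methods declared on the fixture and print their display names.
        var displayNames = typeof(UnderscoreNamedTests)
            .GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Select(m => TestNameFormatter.ToDisplayName(m.Name));

        foreach (string name in displayNames)
        {
            Console.WriteLine(name);
        }
    }
}
```

This keeps the language unchanged, at the cost of limiting test names to what an ordinary identifier can spell (no punctuation, no leading digit).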

@DavidArno

@HaloFour,

"Language features shouldn't be designed around tests" (slight misquote) is a dubious assertion. I start with a test ... therefore it's reasonable to expect my language of choice to at least make writing that test as easy as possible. Therefore features that enhance test writing should be considered simply on that merit, unless they adversely affect other areas of coding.

@HaloFour

@DavidArno This feature doesn't affect how you write tests, it affects how you visualize the results, and only because the tools currently just vomit the names into their reports. There are plenty of other options that don't require modifying the entirety of the language, such as reformatting the class/method names or including free-form descriptions in the attributes.
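
The "free-form descriptions in the attributes" option could look roughly like the sketch below. TestDescriptionAttribute, ArithmeticTests and DescriptionReport are hypothetical stand-ins defined here so the example is self-contained; real test frameworks have their own description mechanisms, which this does not try to reproduce.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute carrying free-form display text for a test.
[AttributeUsage(AttributeTargets.Method)]
public sealed class TestDescriptionAttribute : Attribute
{
    public TestDescriptionAttribute(string text) { Text = text; }
    public string Text { get; }
}

public class ArithmeticTests
{
    [TestDescription("2 times 2 should be 4")]
    public void TwoTimesTwoShouldBeFour() { }

    public void Undescribed() { }
}

public static class DescriptionReport
{
    // Prefers the description text and falls back to the method name.
    public static string DisplayName(MethodInfo method)
    {
        TestDescriptionAttribute description =
            method.GetCustomAttribute<TestDescriptionAttribute>();
        return description != null ? description.Text : method.Name;
    }

    public static void Main()
    {
        MethodInfo described =
            typeof(ArithmeticTests).GetMethod(nameof(ArithmeticTests.TwoTimesTwoShouldBeFour));
        Console.WriteLine(DisplayName(described));
    }
}
```

The trade-off raised elsewhere in the thread still applies: the description and the method name are two pieces of text that have to be kept in sync by hand.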

@MgSam

MgSam commented Feb 10, 2016

Whitespace as part of legal identifier names is useful in languages where data is a first class object (such as SQL). I think it's confusing and counterproductive in a general purpose language like C#.

@DavidArno

@HaloFour,

[Test]
void `Test that 2 + 2 = 4`()
{
    ...
}

versus

[Test]
void TestThat2Plus2Equals4()
{
    ...
}

Expressing test methods in natural language inherently makes the test's purpose easier to determine. It's not just about the output of test results. Which is why this feature is so useful in F#, regardless of the test framework used.

@MgSam,

Part of the reasoning behind some of the feature choices for C# 7 is to make it more of a data-orientated language. Thus your argument actually counts as a plus point for this feature...

@alrz
Member

alrz commented Feb 10, 2016

@DavidArno Do you really write all the arrange, act and assertion in your test method name?

@p-e-timoshenko
Author

This discussion is fun. I'm rolling on the floor :)

In languages that have this feature, it is used only where it is really needed. How often do DB table names contain spaces?

If the language shouldn't give this freedom, then let it restrict names strictly: they would have to match the regular expression "[_A-Za-z][_A-Za-z0-9]*". Not everyone understands a variable name like "НеКаждыйПойметНазваниеЭтойПеременной" (Russian for "not everyone will understand the name of this variable") ^)
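
As a sketch of what that strict rule would accept and reject (the class StrictNameRule is hypothetical; the pattern is the one quoted above, anchored so that it has to match the whole name):

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictNameRule
{
    // The ASCII-only pattern from the comment, anchored at both ends.
    private static readonly Regex AsciiIdentifier =
        new Regex(@"\A[_A-Za-z][_A-Za-z0-9]*\z");

    public static bool IsStrictName(string name)
    {
        return AsciiIdentifier.IsMatch(name);
    }

    public static void Main()
    {
        Console.WriteLine(IsStrictName("repositoryFileName"));    // True
        Console.WriteLine(IsStrictName("repository file name"));  // False: contains spaces
        Console.WriteLine(IsStrictName("2D"));                    // False: starts with a digit
        // False under this rule, although C# itself accepts the identifier.
        Console.WriteLine(IsStrictName("НеКаждыйПойметНазваниеЭтойПеременной"));
    }
}
```

C# identifiers already go beyond this pattern by allowing Unicode letters, so the rule would be a tightening of the current language rather than a description of it.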

@bondsbw

bondsbw commented Feb 10, 2016

This suggestion could lead to easier translations from data languages that allow spaces in their keys or names, such as JSON.
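
For comparison, here is a sketch of how such keys can be mapped without language support, using attributes. It assumes the System.Text.Json library, which postdates this thread; the Person type, its key names and the sample data are made up for illustration.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical data type: the JSON keys contain spaces, the C# names do not.
public sealed class Person
{
    [JsonPropertyName("first name")]
    public string FirstName { get; set; }

    [JsonPropertyName("last name")]
    public string LastName { get; set; }
}

public static class JsonKeyDemo
{
    public static Person Parse(string json)
    {
        return JsonSerializer.Deserialize<Person>(json);
    }

    public static void Main()
    {
        Person person = Parse("{\"first name\": \"Ada\", \"last name\": \"Lovelace\"}");
        Console.WriteLine(person.FirstName + " " + person.LastName);
    }
}
```

With identifiers that allow spaces, the attribute and the property name would collapse into one spelling; whether that is worth a language change is the open question here.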

@DavidArno

@alrz,

All the time; don't you? :)

Perhaps that wasn't a great example, but taking a real example, I find I tend to use a weird pidgin English with test method names, e.g.

public void WhenOptionNotValue_ResultsInExceptionIfValueRead()

Using natural language, it becomes much clearer, I feel, as to what I mean with the test:

public void `When Option<T> has no value, an exception is thrown when Value is read`()

@alrz
Member

alrz commented Feb 11, 2016

@DavidArno I'd say the compiler should infer the whole method body right from the name. You've just discovered yet another recursively nonsensical point. Because if you write type/method/properties' names in your test method, it doesn't even support refactoring. What a disappointment. I think it should be integrated with something like #8503 so you can actually write your test in the method name and it gets compiled right there. #MakeC#GreatAgain

@alrz
Member

alrz commented Feb 11, 2016

I do note that this'd be more of a tooling improvement rather than language change, see this blog post.

@GeirGrusom

@p-e-timoshenko int НеКаждыйПойметНазваниеЭтойПеременной() is a valid method definition regardless of the reader's ability to read it.

Anyway, why can't these things be represented by metadata?

@p-e-timoshenko
Author

@GeirGrusom
Of course, "int НеКаждыйПойметНазваниеЭтойПеременной()" is a valid definition. But we don't even stop to think about how bad that is. Some individuals even write programs with such names, though the number of such developers is very small compared with all the others.

I can list a number of cases where the strange names are generated. For example, resource property names have it.

using static Properties.Resources;
MessageBox.Show(`Do you want to quit?`);

@GeirGrusom

I can list a number of cases where the strange names are generated. For example, resource property names have it.

But isn't that with the specific intention that those identifiers should be inaccessible from the language? Anonymous types and closures have special names generated as well which currently cannot be expressed in C#, but this proposal would suddenly allow it.

using static Properties.Resources;
MessageBox.Show(`Do you want to quit?`);

If you change the value to "Are you sure you want to quit?" without changing the resource name, that would be confusing as hell. There is very little reason not to call that resource ExitQuery or something similar instead.

@gafter
Member

gafter commented Feb 11, 2016

String s = "foo";

Is that a declaration of a new variable named s, or an assignment to an existing variable named String s?

gafter closed this as completed Feb 11, 2016
gafter added the Question and Resolution-Answered (the question has been answered) labels Feb 11, 2016
@bondsbw

bondsbw commented Feb 11, 2016

@gafter I don't think that syntax was proposed. `String s` = "foo"; would be the syntax to assign to an existing variable named `String s`.

@DavidArno

@gafter,

You have completely misunderstood this proposal. Identifiers with spaces would need to be enclosed in, e.g., ` characters, so your line should be:

`String s` = "foo";

Please re-open this issue, as it has not been "resolution answered".
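
The difference between the two readings can be made concrete with a toy tokenizer. This is only a sketch of the idea, not how the C# compiler lexes: BacktickLexer is hypothetical, it treats text between a pair of backticks as one identifier token, and it ignores escapes inside string literals.

```csharp
using System;
using System.Collections.Generic;

public static class BacktickLexer
{
    // Splits source text into tokens. Text between a pair of backticks becomes
    // one identifier token (spaces included). Outside backticks, whitespace
    // separates tokens and each punctuation character is its own token.
    public static List<string> Tokenize(string source)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            if (char.IsWhiteSpace(c))
            {
                i++;
            }
            else if (c == '`')
            {
                int close = source.IndexOf('`', i + 1);
                if (close < 0) throw new FormatException("Unterminated quoted identifier.");
                tokens.Add(source.Substring(i + 1, close - i - 1));
                i = close + 1;
            }
            else if (c == '"')
            {
                int close = source.IndexOf('"', i + 1);
                if (close < 0) throw new FormatException("Unterminated string literal.");
                tokens.Add(source.Substring(i, close - i + 1));
                i = close + 1;
            }
            else if (char.IsLetterOrDigit(c) || c == '_')
            {
                int start = i;
                while (i < source.Length && (char.IsLetterOrDigit(source[i]) || source[i] == '_'))
                {
                    i++;
                }
                tokens.Add(source.Substring(start, i - start));
            }
            else
            {
                tokens.Add(c.ToString());
                i++;
            }
        }
        return tokens;
    }

    public static void Main()
    {
        // Unquoted: "String" and "s" are two tokens, so this reads as a declaration.
        Console.WriteLine(string.Join(" | ", Tokenize("String s = \"foo\";")));
        // Quoted: "String s" is one token, so this reads as an assignment.
        Console.WriteLine(string.Join(" | ", Tokenize("`String s` = \"foo\";")));
    }
}
```

With explicit delimiters the lexer never has to guess; the ambiguity in the unquoted line only arises if spaces are allowed in bare identifiers.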

@alrz
Member

alrz commented Feb 12, 2016

I think @gafter just answered the question.

Is there a technical reason why white spaces aren't allowed in names of constants, variables, methods, classes and namespaces or is it in accordance with any convention?

Therefore, Resolution-Answered.

@DavidArno

@alrz,

Generally, it helps to wind people up less if one reads the whole question, rather than just the first sentence, before answering...

@alrz
Member

alrz commented Feb 12, 2016

I understand what it feels like to have 400+ items in the backlog. I wouldn't be bothered if one of my proposals were misunderstood; that's why I try to keep them concise and right to the point. I'm just saying that it was not a reasonable question to ask in the first place, yet it reads like the motivation for the proposal. Just pointing out that you're not being fair.

@p-e-timoshenko
Author

What programming languages support whitespaces in identifiers?
What are the goals?

I think this feature could be used for name compatibility with other programming languages.

  1. F# allows white space in identifier names, but they must be surrounded with double backticks.
  2. Ruby supports variable names with Unicode whitespace characters.
  3. Scala allows arbitrary identifiers using backticks.
  4. The TikZ language for creating graphics in LaTeX allows whitespace in parameter names (also known as 'keys'). For instance, you see things like
\shade[
  top color=yellow!70,
  bottom color=red!70,
  shading angle={45},
]

In this restricted setting of a comma-separated list of key-value pairs, there's no parsing difficulty. In fact, I think it's much easier to read than the alternatives like topColor, top_color or topcolor.

  5. In JavaScript, you can write foo=bar, but what you've really said is:

window['foo'] = bar;

You could just as easily write

window['i haz a name'] = bar;

The various scopes in Coldfusion can also be treated as either a (dict|hash|associative array) or a name.

  6. T-SQL will allow you to use whitespace in table and column names as long as you put them between square brackets [ ]

  7. FORTRAN compilers ignored spaces, so these statements were all equivalent:
    result = value * factor
    r e s u l t = val ue * fac tor
    result=value*factor
    ...

@HaloFour

@p-e-timoshenko The only language in that list that is somewhat in the C/C++ family of languages, at least in terms of syntax, is JavaScript, and in that case you're not permitted to have identifiers with whitespace in them. But since every object is effectively a dictionary you can have keys of any arbitrary string. You cannot declare anything which requires an identifier with an arbitrary name containing whitespace, such as a variable or class.
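
The same distinction between dictionary keys and identifiers can be sketched in C# itself. ExpandoObject is used here as a rough analogue of a JavaScript object; the class DictionaryKeyDemo and the keys are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class DictionaryKeyDemo
{
    // Builds an ExpandoObject, which stores its members in a string-keyed dictionary.
    public static IDictionary<string, object> BuildScope()
    {
        IDictionary<string, object> scope = new ExpandoObject();
        scope["foo"] = 1;            // key that is also a legal identifier
        scope["i haz a name"] = 2;   // key that is not a legal identifier
        return scope;
    }

    public static void Main()
    {
        IDictionary<string, object> scope = BuildScope();
        dynamic asObject = scope;

        // Member syntax works only for keys that are valid identifiers.
        Console.WriteLine((int)asObject.foo);
        // The spaced key is reachable only through the string indexer.
        Console.WriteLine((int)scope["i haz a name"]);
    }
}
```

Neither language lets the spaced key appear where the grammar expects an identifier, which is the point being made about JavaScript.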

@bondsbw

bondsbw commented Feb 12, 2016

@HaloFour Not many languages in the C/C++ family had lambda expressions before C#, but that didn't mean it was a bad idea.

@HaloFour

@bondsbw Lambda expressions in the language provide compelling use cases. Whitespace in the identifiers in the language does not. Anywho, seems the team has spoken and I doubt that the answer will change.
