Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add indent_method property for nontrivial indentation methods #38

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

florianb
Copy link
Member

@florianb florianb commented Jan 24, 2023

This PR aims to solve the requirement of editorconfig/editorconfig#323 to add support for smart_tabs.

I propose to add a generic indent_method property to be able to

  1. add support for non-trivial indentation methods
  2. allow a backward compatible definition of indentation methods. Therefore the indent_method may override indent size and style if required.

I added two common (?) methods for the beginning: smart_tabs and elastic_tabstops, the list may grow in the future.

I would prefer to also explain what smart_tabs and elastic_tabstops refer to, for this it would be necessary to write a small explanation to avoid the risk any referenced website goes down.

What do you think?

edit: renamed elastic_tabs to elastic_tabstops

@Naheulf
Copy link

Naheulf commented Jan 24, 2023

Both smart_tabs and elastic_tabs refer to an alignment property. So including indent in the property name seem a nonsense for me and seem a bad idea. Also, both values answer to the question "How to fill an already computed indent width?". But none of them say how to compute that width. That why including method in the property name also seem a bad idea.

As I wrote in editorconfig/editorconfig#323 (comment), there is already a property named indent_style to choose the style of the indentation. That why I suggested the name align_style to choose the style of the alignment.
An align_method or align_size property may be created later to answer the question "How to compute the with of the alignment.

As the property value smart_tabs mean "use spaces" (another nonsense), a better name for it in a specification should be smart_spaces or simply space as it just fill already computed indent width with spaces.

For example something like that.

   * - ``align_style``
     - Set to ``space`` or ``tab`` or ``elastic_tab`` to use a simple spaces or simple tabs or one elastic
       tabstops respectively (when supported). The tabstop value is defined by align_size and may override the ``indent_style``
       or ``indent_size`` settings as required.
   * - ``align_size``
     - Set to a whole number defining the number of columns used for each align level. If `align_style` equals `tabs` it should be a multiple of `tab_width`. If `align_style` equals `elastic_tabs` only one tab is used but the editor can stretch it up to that size. If it is set to `smart` the editor can use some heuristics to find the best width. The values are case insensitive.

With that:

  • the "smart_tabs" setting translate to indent_style=tab, align_style=space, align_size=smart (don't forgot that smart tabs mean use spaces for alignment)
  • the "elastic_tabs" setting translate to align_style=elastic_tab, align_size=smart.

Even with that some case remains:

  • are tabstops defined from the start of the line or the end of indentation?
  • how handle the case where fixed tabs are used but align_size does not match tab_size? Where should go the remaining width ?
  • minimal and maximal values for embedded alignment (variables after assignment operator, function args, end line comments, ...)

@florianb
Copy link
Member Author

florianb commented Jan 25, 2023

Thanks for your comment.

I proposed the property with the indent_ prefix to make it easier to comprehend that this setting might have side effects on or from the other indent-properties.

The proposal is also based on jedmaos comments (editorconfig/editorconfig#323 (comment)) where we got a common ground that any alignment might affect (ignore) any indentation settings.

I considered introducing some extra property as you describe here - i guess there won't be that many different alignment/indentation strategies justifying the complexity of adding two properties.

It is - in my opinion - unlikely that any of these "complex" indentation mechanisms are universally described and implemented by "simple" text editors. So this is anyways a setting for developers of an IDE (alike) or a plugin, so i guess everything they need to know is which strategy to apply.

If this assumption is true, our indent_method would be nothing more than a simple unified identifier for the given method to apply.

I don't see it as our business to actually specify how smart_tabs or whatsoever should work - that's why i would appreciate to create a small description which "just" holds enough information to get a clue what this setting might lead to. In any case we would have to reference to a further external resource for more information.

Using a slightly more general approach would allow us to add support for future indentation methods more easily.

@cowwoc
Copy link

cowwoc commented Jan 25, 2023

I agree with @florianb that we should keep the tone of the discussion civil. That said, I like @Naheulf's approach of separating indentation and alignment into two separate properties.

I'm 👍 for indent_style and align_style but 👎 for align_size. Not because I think align_size is necessarily a bad idea but because my gut feeling is that it might not generalize well to future alignment strategies. We could always add it in a future iteration.

I have two open questions:

  1. How often do editors use anything other than "smart" align_size? I've only ever seen "smart" getting used in real-life.
  2. What is the specified mechanism for handling backwards compatibility? Meaning, are we forced to add new properties ever time we want to add a new value to an existing property? Or does the specification state what editors are expected to do if they come across a value they do not support?

It seems that configuration files do not contain the version number of the specification, so editors have to decide on a per-property level whether they support the current specification or not. This seems a bit sloppy, unless you plan to introduce new properties every time you want to add values to existing properties. I'm not a fan of this approach.

@florianb
Copy link
Member Author

Good points @cowwoc - thank you.

I took the opportunity to do a little non-representative research on how often f.e. elastic tabstop support was requested. I got the following impression:

  1. Indentation/alignment techniques are often mixed up, even though "smart tabs" seems to be a more common label there exist additional names and variations of indentation attempts.
  2. Elastic tabstops are not widely used even though there is a userbase and there are prominent tools like tabwriter.

My whole point is about being able to add future alignment/indentation configs when the requirement arrives.

So to answer your second question: the editorconfig plugins shall ignore unrecognized keys and values. Outdated plugins will still use the indent-settings as defined.

There is a version number for the specification and implementing tools may reference to a specification-version.
And there might be a version property in the future, but i guess one success factor of the EditorConfig format is its simplicity.

Also one additional thought about the indentation vs alignment debate: i have still no substantial reason to call it align_ because in my opinion people searching for an option to adjust the subsequent "alignment" of code will likely search for way to "indent". MS VisualStudio, JetBrains they all describe "smart" tabs under the indentation topic.

@Naheulf
Copy link

Naheulf commented Jan 26, 2023

Sorry for my language. I edited my previous message to make it less hard.

The align_size property is only an example of possible additional property. I write it because I already see codes like that:

[some code before]
<indent>functionNameToCall(arg1,
<indent><align?>arg2
<indent><align?>arg3
<indent>)

In this example both <indent> and <align?> are placeholders. If I remember correctly the first was a multiple of three spaces while the later was a constant five space for all such align types. (Or does it count as special indent in that case?)

My whole point is about being able to add future alignment/indentation configs when the requirement arrives.

I didn't think about that in my previous message.

@florianb
Copy link
Member Author

florianb commented Jan 26, 2023

Dear @Naheulf - thank you very much for your apology, I have adjusted my answer as well.

Thank you also for the additional information - i get it now i guess (i sometimes need a bit).

Based on the recent "research" about the smart tabs capability i am thinking that the applied functionality falls into two categories:

  1. "Simple" smart indentation, where subsequent alignments are indented by a "tab width" using spaces. I guess that's nearly what you described:
indent__indent__my_function(arg1,
indent__indent__align__arg2,
indent__indent__align__arg3)

In this example the tab width is 7.

Then there is
2. "Semantical" smart indentation, where subsequent alignments are indented based on semantical information about the used programming language like this:

indent__indent__my_function(arg1,
indent__indent__align_______arg2,
indent__indent__align_______arg3)

I haven't tested this personally but the VS docs state on the indentation setting:

When [Smart is] selected, new lines are positioned to fit the code context, per other code formatting settings and IntelliSense conventions for your development language. This option is not available for all development languages.

This reads like the alignment is calculated.

I hope this behavior makes a specification of a dedicated align_size obsolete for the sake of simplicity. Maybe the tab_widthsetting might be reused for that requirement if needed?. Or we leave it for a future requirement?

@cowwoc
Copy link

cowwoc commented Jan 26, 2023

I have personally never seen anyone use tabs for alignment. Have you?

Every IDE I've seen defines "smart tabs" as "tabs for indentation, spaces for alignment" because alignment requires 1-character granularity and tabs don't provide that.

@florianb
Copy link
Member Author

florianb commented Jan 26, 2023

Not sure how this adds to the discussion but i have seen tabs being used for indentation as well as for alignment.

And there are editors and tool supporting this (f.e. ReSharper). I have no clue how many people use this and i don't feel obligated to fight for tabs for alignment.

All i say is that how things are getting aligned is a function known to the implementors of editors and plugins, and they calculate the number of spaces and whatsoever on their own - and i say that even for a label as widely spread as "smart tabs" there are different implementations of what it will do.

@cowwoc
Copy link

cowwoc commented Jan 26, 2023

Fair enough.

@Naheulf
Copy link

Naheulf commented Jan 26, 2023

From what I understand, elastics tab stops allow to use only one tab for alignment (instead of many tabs/spaces). With that we can do something like that: (example with the Notepad++'s ElasticTabstops plugin)
image
If someone change the LongArg2 for something else with a different length, all other arguments and comments keep getting aligned with no change on adjacent lines.

However the "magic" alignment does not allow to align number on coma:
image

Given that elastics tab-stops has almost no support in editors and very few people know about it, I didn't see any code with such alignment. (I didn't try to find through)

Edit: I use tab indent space align for my codes.

@cowwoc
Copy link

cowwoc commented Jan 27, 2023

Another approach we can consider is creating hierarchical properties for each kind of formatting style. I'm going to use dummy property names so please gloss over that details:

alignment=spaces

or

alignment=tabs
alignment.tab.size=2

or

alignment=smart_tabs
alignment.smart_tabs.tab_size = 4

or

alignment=elastic_tabs
alignment.elastic_tabs.whatever-property-would-be-relevant=...

This way we don't have to break our heads trying to come up with properties that would extend to future approaches. Only alignment would have to contain a unique value.

Again, the actual property names/values could be changed but the basic idea is that each mechanism would get its own unique properties without having to worry about other mechanisms.

@florianb
Copy link
Member Author

Okay both of you are having concerns about a missing align_size property.

Do you have a specific problem for which you need the align_size property at once?

@cowwoc
Copy link

cowwoc commented Jan 27, 2023

The only use-case that is driving me is "smart_tabs". From my perspective I don't really align_size at this point.

indent_size is fine for now as it already specifies the tab size. It's just not as readable or flexible as hierarchical properties, but that's not the end of the world.

@florianb
Copy link
Member Author

I see - there is already a indent_size and a tab_width setting so i just feel like i shouldn't light hearted run for a third size property.

There are thoughts going into that direction (f.e. see editorconfig/editorconfig#482) and i expect the whole standard to evolve more or less soon.

@darkstego
Copy link

I want to cast my vote to the align_type property. In addition to the space and tab properties, you can also have a follow_indent to mimic indent style.

To be honest, I don't see many use cases for anything other than space as an align type, except for code that use tabs and align trailing comments.

@zougloub
Copy link

zougloub commented Feb 16, 2023

I'd also like to use align for the name of that property too, as it differentiates between what is indentation and what is alignment. I guess its value should default to the value of the analogous indent property.

My personal style, which I'd like to be supported some day®, is to indent with tabs (it's the next best thing due to a missing native indentation glyph in ASCII), align with spaces, and by default I align with a single space (which is not standard "Smart Tabs", which considers the content of the previous line to align to a variable amount of spaces), but maybe more when it's more aesthetically meaningful.
For example:

class Demo:
▶️"""
▶️Demonstrate smart tabs indentation with single space continuation lines
▶️"""
▶️def __init__(
▶️ x, # single space continuation line alignment
▶️ y,
▶️):
▶️▶️self.data = some_function(
▶️▶️ c=(
▶️▶️    1, # aesthetic alignment with the one below
▶️▶️  + 1,
▶️▶️ ),
▶️▶️ d={
▶️▶️  "e": "f": {
▶️▶️   "g": "h",
▶️▶️  },
▶️▶️ },
▶️▶️ i=[
▶️▶️  [ 1,  2,  3],
▶️▶️  [-4, -5, -6], # numbers aligned too for more clarity
▶️▶️ ],
▶️▶️)
class Demo {
 private: # Some people happen to indent this (and goto labels) relative to the parent scope, not the current scope... and I align them
▶️int a;
▶️int b;
 public:
▶️Demo()
▶️ : a(1)
▶️ , b(2)
▶️{
▶️ here: 
▶️▶️if (false) {
▶️▶️▶️goto here;
▶️▶️}
▶️▶️
▶️▶️switch(a): {
▶️▶️ case 1:
▶️▶️▶️b = 3;
▶️▶️▶️break;
▶️▶️}
▶️}
};

So a config something like:

# Indent with tabs
indent = tab

# I don't care about tab size, and depending on my mood I will change it on the same file
#indent_size

# Align with spaces
align = space

# Fixed alignment (vs. "semantic" that aligns to after a function name or opening parenthesis)
align_method = fixed
align_size = 1
align_strict = false

# De-indent labels by one logical level
label_indentation = -1

I'm not a huge fan of "semantic" alignment to the end of a function name, as it shifts lines when a function name changes.
I don't think it's necessary for alignment to follow the same rules as alignment.

The "evil" effect of tabs on alignment of right-side comments when tab size varies doesn't bother me, the amount of code that is commented causing this (ie. right-side-commented across indentation levels), is pretty minimal.

I don't use it, but Elastic Tabs is also more an alignment method than an indentation method, with a tab having a varying width determined after an analysis of the surrounding text.

@xuhdev
Copy link
Member

xuhdev commented Mar 4, 2023

I personally agree that indent_method is a better name than smart_tab. However, I don't see how the two paths conflict. We can still add smart_tab to the spec and decide on indent_method later. It'll take some time to get editor plugins to implement smart_tab.

@florianb
Copy link
Member Author

Hey @xuhdev - i'd like to get this PR done.. :) I am not sure i understand your comment right, you vote for staying with indent_method or consider using align_type/method?

After that i try to figure out if we need a more detailed description to also reference the implications on former settings (like indent_style).

I am really thankful for everyone participating in this discussion 🙏 - i think we should make this feature available and we should primarily check if we block any future extension of the spec with this approach (which we are - in my opinion - not doing).

I still recommend stay with "intend" as the prefix since people will/might connect this to the handling of invisibles at the beginning of the line, and since this might have impact on other indent-settings it seems to be a good idea to make that clear with the setting-name as well.

This PR aims to solve the requirement of editorconfig/editorconfig#323 to add support for `smart_tabs`.

I propose to add a generic `indent_method` property to be able to
1. add support for non-trivial indentation methods
2. allow a backward compatible definition of indentation methods. Therefore the `indent_method` may override indent size and style if required.

I added two common (?) methods for the beginning: `smart_tabs` and `elastic_tabs`, the list may grow in the future.

I would prefer to also explain what `smart_tabs` and `elastic_tabs` refer to, for this it would be necessary to write a small explanation to avoid the risk any referenced website goes down.

What do you think?
@zougloub
Copy link

At the risk of sounding pedantic... indentation and alignment serve different purposes and are not interchangeable so this shouldn't be a coin toss kind of decision:

  • Indentation is moving a block of code or text more or less away from the first column, to indicate different levels of scope (or nesting... additional moving of braces enclosing a scope (eg. in GNU C style) is also considered indentation, and it's also possible to consider multi-line C++ lambda bodies as providing additional indentation). Basically it has to do with function bodies, control flow, braces (C), def/colon (Python), begin/end.
  • Alignment, is arrangement of code so that it "lines up". Unlike indentation, alignment does not usually indicate scope or nesting level. Instead, it often serves aesthetic or readability purposes. White-space padding of things enclosed by delimiters which aren't scoping braces have to do with alignment.

Key Differences

  • Purpose: Indentation is used to indicate blocks of code and their scope, while alignment is usually for readability and aesthetics.
  • Uniformity: Indentation is usually uniform within a block of code, whereas alignment can vary based on the length of the code or text around it.
  • Syntax Significance: In some languages like Python, indentation has syntactical significance, whereas alignment usually does not affect how code is executed.

A striking example is that it is possible to indent with tabs and align with spaces (which is ensuring that specific characters line up vertically in a column, regardless of how wide a tab is set to display, and allows for different individual preferences on tab width, which IMHO is quite flexible and respectful) and if it's not supported, it kind of means something's not right in the design.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants